1.1. General introduction.

This paper is devoted to developing a systematic approach to the analysis of the long 
time behaviour of the dynamics of certain mean field spin systems, where by dynamics we 
understand of course a stochastic dynamics of Glauber type. For the purposes of this paper, 
we will always choose this as reversible with respect to the Gibbs measure of the model. By 
long time behaviour we mean that we are interested in time scales on which the phenomena 
of "metastability" occur, i.e. time scales that increase exponentially fast with the volume 
of the system. Our primary motivation comes from the study of disordered spin systems, 
and most particularly the so-called Hopfield model [Ho,BG1], although in the present paper 
we only illustrate our results in a much simpler setting, that of the random field Curie-Weiss 
model (RFCWM) (see e.g. [Kl]). Our chief objective is to be able to control in a precise 
manner the effect of the randomness on the metastable phenomena. 

On a heuristic level, metastable phenomena in mean field models are well understood. 
The main idea is to consider the dynamics induced on the order parameters by the Glauber 
dynamics on the spin space, i.e. the macroscopic variables that characterize the model. A first 
issue that arises here, and that we will discuss at length below, is that this induced dynamics 
is in general not Markovian. However, one may always define a new Markovian dynamics that 
"mimics" the old one and that is reversible with respect to the measures induced on order 
parameters by the Gibbs measures. This dynamics on the order parameters is essentially a 
random walk in a landscape given by the "rate function" associated to the distribution of the 
order parameters. The accepted picture of the resulting motion is that this walk will spend 
most of its time in the "most profound valleys" of the rate function and stay in a given valley 
for an exponentially long time of order $\exp(N\Delta F)$, where $\Delta F$ is the difference between the 
minimal value of the rate function in the valley and its value at the lowest "saddle point" 
over which the process may exit the valley. An excellent survey on this type of processes 
is given in van Kampen's textbook [vK], although most of the results presented there, and 
in particular all those related to the long time behaviour, concern the one-dimensional case. 
Rather surprisingly, one finds very few papers in the literature that really treat this problem 
with any degree of mathematical rigour. One exception is the classical paper by Cassandro, 
Galves, Olivieri, and Vares [CGOV] (see also [Va] for a broader review on metastability) who 
consider (amongst others) the case of the Curie-Weiss model, in which there is only a single 
order parameter and thus the resulting dynamics is that of a one-dimensional random walk. 
More recently, a particular version of the RFCWM that leads to a two-dimensional problem 
was treated by Mathieu and Picco [MP]. However, there is an abundant literature on two 
types of related problems. One of these concerns Markov chains with finite state space and 
exponentially small transition probabilities. They are treated in the work of Freidlin and 
Wentzell (but see below for a discussion) and have since then been investigated intensely (for 
a small selection of recent references see [Sc,OS1,OS2,CC,GT]). In the context of stochastic 
dynamics of spin systems, they occur if finite systems are considered in the limit of zero 
temperature.^5 A second class of problems, that is in a sense closer to our situation, and that 
can be obtained from it formally by passing to the limit of continuous space and time, is that 
of "small random perturbations of dynamical systems" i.e. a stochastic differential equation 
of the form 

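$$dx^\epsilon(t) = b(x^\epsilon(t))\,dt + \sqrt{\epsilon}\,a(x^\epsilon(t))\,dW(t) \tag{1.1}$$

where $x^\epsilon(t) \in \mathbb{R}^d$, and $W(t)$ is a $d$-dimensional Wiener process, and in the case of a reversible 
dynamics the drift term $b(x^\epsilon(t),\epsilon)$ is given by $b(x,\epsilon) = \nabla F_\epsilon(x)$, $F_\epsilon(x)$ being the rate function. 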

The basic reference on the problem (1.1) is the seminal book by Wentzell and Freidlin 
[FW] which discusses this problem (as well as a number of related ones) in great detail. 
Many further references can be found in the forthcoming second edition of this book. One of 
the important aspects of this work is that it devises a scheme that allows one to control the 
long-time dynamics of the problem through an associated Markov chain with finite state space 
and exponentially small transition probabilities. The basic input here consists of large deviation 
estimates on the short time behaviour of the associated processes. This treatment has inspired 
many subsequent works, which it is impossible to summarize to any degree of completeness. 
For our purposes, an important development is a refinement of the estimates which in [FW] 
are given only to the leading exponential order in e to a full asymptotic expansion. Relevant 
references are [Ki1-4,Az,FJ]. The work of [FJ] in particular is very interesting in that it 
develops full asymptotic expansions to all orders for certain exit probabilities using purely 
analytic techniques based on WKB methods. Very similar results are obtained in [Az] using 
refined large deviation techniques. To our knowledge all the refined treatments that have 
appeared in the literature treat only specific "local" questions, and there seems to be no 
coherent treatment of the global problem in a complicated (multi-valley) situation that takes 
into account sub-leading terms. 

^5 Let us mention, however, that there has been considerable work done on the dynamics of spin systems 
on infinite lattices; see in particular the recent paper by Schonmann and Shlosman [SS] on the metastable 
behaviour in the two-dimensional Ising model in infinite volume, and references therein. 
The problems we will study essentially require us to redo the work of Freidlin and Wentzell 
in the setting of our Markov chains. Moreover, for the problems we are interested in, it will 
be important to have a more precise control, beyond the leading exponential asymptotics, 
for the global problem, if we want to be able to exhibit the influence of the residual 
randomness. The point is that in many disordered mean field models very precise estimates of the 
large deviation properties of the Gibbs measures are available. Typically, the rate function 
is deterministic to leading order (although not equal to the rate function of the averaged 
system^6!) while the next order corrections (typically, but not always, of order $N^{-1/2}$) are 
random. To capture this effect, some degree of precision in the estimates is thus needed. On 
the other hand, we will not really need a full asymptotic expansion^7 of our quantities, and 
we will put more effort into the control of the "global" behaviour than into the overly precise 
treatment of "local" problems. A main difference is of course that we do not have a stochastic 
differential equation but a Markov chain on a discrete state space^8. Therefore one may draw 
intuition from the proofs given in the continuous case without being able to use any result 
proved in that context directly. Finally, our goal is to give a treatment that is as simple and 
transparent as possible. This is the main reason to concentrate on the reversible case, and 
one of our strategies is to use reversibility to as large an extent as possible. This allows us to 
replace refined large deviation estimates by simple reversibility arguments. Large deviation 
estimates are then only used in less delicate situations. In the same spirit, we will take 
advantage of the discrete nature of the problem whenever this is possible (just to compensate 
for all the disadvantages we encounter elsewhere). This will surprise the reader familiar with 
the continuous case, but we hope she will be convinced at the end that this was a pleasant 
surprise. 

Let us say a final word concerning our preoccupation with the dependence on 
dimensionality. One of our ultimate goals is to be able to treat, e.g., the Hopfield model in the case 
where the number of order parameters grows with the volume of the system. On the level of 
the mean field dynamics, this requires us to be able to treat a system where the dimension 
of the space grows with the large parameter. Although we will not consider this situation in 
this first paper, we will achieve a precise control of the dimension dependence of sub-leading 
corrections. 

^6 It is important to keep in mind that the main effect of the disorder manifests itself in a deterministic 
modification of the rate function. This effect is somewhat reminiscent of the phenomenon of homogenization. 

^7 We believe that it is possible to obtain such an expansion for the global problem. However, this will 
require a much more elaborate analysis which we postpone to future publications. 

^8 The state space is even finite for any N, but its size increases with N, which renders this fact rather 
useless. 
1.2. The general set-up. 

We will now describe the general class of Markov chains we will consider. Their relation to 
disordered spin systems will be explained in Section 7 and a specific example will be discussed 
in Section 8. Section 7 can be read now, if desired; on the other hand, the bulk of the paper 
can also be read without reference to this motivation. 

We consider canonical Markov chains on a state space $\Gamma_N$ where $\Gamma_N$ is the intersection 
of some lattice^9 (of spacing $O(1/N)$) in $\mathbb{R}^d$ with some connected $\Lambda \subset \mathbb{R}^d$ which is either 
open or the closure of an open set. To avoid some irrelevant issues, we will assume that $\Lambda$ 
is either $\mathbb{R}^d$ or a bounded and convex subset of $\mathbb{R}^d$. $\Gamma_N$ is assumed to have spacing of order 
$1/N$, i.e. the cardinality of the state space is of order $N^d$. Moreover, we identify $\Gamma_N$ with a 
graph with finite ($d$-dependent) coordination number respecting the Euclidean structure in 
the sense that a vertex $x \in \Gamma_N$ is connected only to vertices at Euclidean distances less than 
$c/N$ from $x$. The main example the reader should have in mind is $\Gamma_N = \mathbb{Z}^d/N \cap \Lambda$, with 
edges only between nearest neighbors. We denote the set of edges of $\Gamma_N$ by $E(\Gamma_N)$. 

Let $Q_N$ be a probability measure on $(\Gamma_N, \mathcal{B}(\Gamma_N))$. We will set, for $x \in \Gamma_N$, 

$$F_N(x) \equiv -\frac{1}{N} \ln Q_N(x) \tag{1.2}$$
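To make definition (1.2) concrete, here is a minimal sketch in one dimension. It assumes, purely for illustration, that $Q_N$ is the law of the magnetization in a standard Curie-Weiss model at inverse temperature `beta`; this choice, the function names, and the parameter values are not taken from the text.

```python
import math

def curie_weiss_rate_function(N, beta):
    """Toy example: F_N(m) = -(1/N) ln Q_N(m) for a Curie-Weiss magnetization.

    Assumption (not from the text): Q_N(m) is proportional to
    binom(N, k) * exp(beta * N * m**2 / 2), where k is the number of up spins
    and m = -1 + 2k/N runs over a lattice of spacing 2/N in [-1, 1].
    """
    ks = range(N + 1)
    ms = [-1.0 + 2.0 * k / N for k in ks]
    log_w = [math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
             + 0.5 * beta * N * m * m
             for k, m in zip(ks, ms)]
    # Normalize in log space to avoid overflow for large N.
    top = max(log_w)
    log_Z = top + math.log(sum(math.exp(w - top) for w in log_w))
    F = [-(w - log_Z) / N for w in log_w]
    return ms, F

def interior_local_minima(F):
    """Indices i where F[i] lies strictly below both neighbouring values."""
    return [i for i in range(1, len(F) - 1) if F[i] < F[i - 1] and F[i] < F[i + 1]]

ms, F = curie_weiss_rate_function(N=200, beta=1.5)
minima = interior_local_minima(F)
print([ms[i] for i in minima])  # lattice positions of the local minima of F_N
```

For `beta > 1` this toy $F_N$ is a symmetric double well, so the set of local minima has two elements, which is the simplest situation in which metastable transitions between minima can be discussed.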

We will assume the following properties of $F_N(x)$. 
Assumptions: 

R1 $F \equiv \lim_{N\uparrow\infty} F_N$ exists and is a smooth function $\Lambda \to \mathbb{R}$; the convergence is uniform in 
compact subsets of $\mathrm{int}\,\Lambda$. 

R2 $F_N$ can be represented as $F_N = F_{N,0} + \frac{1}{N} F_{N,1}$ where $F_{N,0}$ is twice Lipschitz, i.e. 
$|F_{N,0}(x) - F_{N,0}(y)| \leq C\|x - y\|$ and for any generator of the lattice, $k$, 

$$N\left|F_{N,0}(x) - F_{N,0}(x + k/N) - \left(F_{N,0}(y) - F_{N,0}(y + k/N)\right)\right| \leq C\|x - y\|,$$

with $C$ uniform on compact subsets of the interior of $\Lambda$. $F_{N,1}$ is only required to be Lipschitz, i.e. 
$|F_{N,1}(x) - F_{N,1}(y)| \leq C\|x - y\|$. 

For the purposes of the present paper we will make a number of assumptions concerning 
the functions $F_N$ which we will consider as "generic". An important assumption concerns the 
structure of the set of minima of the functions $F_N$. We will assume that the set $\mathcal{M}_N \subset \Gamma_N$ 
of local minima of $F_N$ is finite and of constant cardinality for all $N$ large enough, and that 
the sets $\mathcal{M}_N$ converge, as $N$ tends to infinity, to the set $\mathcal{M}$ of local minima of the function 
$F$.^10 

^9 The requirement that $\Gamma_N$ is a lattice is made for convenience and can be weakened considerably, if desired. 
What is needed are some homogeneity and rather minimal isotropy assumptions. 


Another set of points that will be important is the set, $\mathcal{E}_N$, of "essential" saddle points 
(i.e. the lowest saddle points one has to cross to go from one minimum to another). Formally, 
we define the essential saddle, $z^*(x,y)$, between two minima $x,y \in \mathcal{M}_N$ as 

$$z^*(x,y) \equiv \arg\inf_{\gamma:\,\gamma(0)=y,\,\gamma(1)=x} \left( \sup_{t\in[0,1]} F_N(\gamma(t)) \right) \tag{1.3}$$

where the infimum is over all paths $\gamma : [0,1] \to \Gamma_N$ going from $y$ to $x$^11 with jumps along 
the edges of $\Gamma_N$ only.^12 

A point $z$ is called an essential saddle point if there exist minima $x,y \in \mathcal{M}_N$ such that 
$z^*(x,y) = z$. The set of all essential saddle points will be denoted by $\mathcal{E}_N$. Our assumptions 
on the $N$ dependence of the set $\mathcal{M}_N$ apply in the same way to $\mathcal{E}_N$. 

G1 We will assume that there exists $\alpha > 0$ such that 
$\min_{x \neq y \in \mathcal{M}_N \cup \mathcal{E}_N} |F_N(x) - F_N(y)| = K_N \geq N^{\alpha - 1}$. 

G2 We assume that at each minimum the eigenvalues of the Hessian of $F$ are strictly positive 
and at each essential saddle there is one strictly negative eigenvalue while all others are 
strictly positive. 

G3 All minima and saddles are well in the interior of $\Lambda$, i.e. there exists a $\delta > 0$ such that for 
any $x \in \mathcal{M}_N \cup \mathcal{E}_N$, $\mathrm{dist}(x, \Lambda^c) > \delta$. 

Remark: We make the rather strong assumptions above in order to be able to formulate 
very general theorems that do not depend on specific properties of the model. They can 
certainly be relaxed. The regularity conditions R2 are necessary only for the application of 
certain large deviation results in Section 4 and are otherwise not needed. 

^10 This assumption can easily be relaxed somewhat. For example, it would be no problem if the function 
$F_N$ is degenerate on a small set of points in the very close (order $N^{-1/2}$) neighborhood of a minimum. One 
then would just choose one of them to represent this cluster. Other situations, e.g. when the function $F$ has 
local minima on large sets, would lead to new effects and would require special treatment. 

^11 Note that here we think of a path as a discontinuous (cadlag) function that stays at a site in $\Gamma_N$ for 
some time interval $\delta t$ and then jumps to a neighboring site along an edge of $\Gamma_N$. This parametrization will 
however be of no importance and just allows some convenient notation. 

^12 We will extend the definition (1.3) also to general points $x,y \in \Gamma_N$. In that case it may happen that 
$z^*(x,y)$ is not a saddle point, but one of the endpoints $x$ or $y$ itself. 
We recall that in our main applications, $Q_N$ will be random measures, but we will forget 
this fact for the time being and think of $Q_N$ as some particular realization. 

We can now construct a Markov chain $X_N(t)$ with state space given by the set of vertices 
of $\Gamma_N$ and time parameter set either $\mathbb{N}$ or $\mathbb{R}_+$. For this we first define for any $x,y$ such that 
$(x,y) \in E(\Gamma_N)$ transition rates 

$$p_N(x,y) \equiv \sqrt{\frac{Q_N(y)}{Q_N(x)}}\, f_N(x,y) \tag{1.4}$$

for some non-negative, symmetric function $f_N$. We will assume that $f_N$ does not introduce 
too much anisotropy. This can be expressed by demanding that 

R3 There exists $c > 0$ such that if $(x,y) \in E(\Gamma_N)$, and $\mathrm{dist}(x,\Lambda^c) > \delta/2$ (where $\delta$ is the 
same as in assumption G3), $p_N(x,y) > c$. 

Moreover, for applications of large deviation results we need stronger regularity properties 
analogous to R2. 

R4 $\ln f_N(x,y)$ as a function of any of its arguments is uniformly Lipschitz on compact 
subsets of the interior of $\Lambda$. 



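For the case of discrete time, i.e. $t \in \mathbb{N}$, we then define the transition matrix 

$$P_N(x,y) \equiv \begin{cases} p_N(x,y), & \text{if } (x,y) \in E(\Gamma_N) \\ 1 - \sum_{z \in \Gamma_N : (x,z) \in E(\Gamma_N)} p_N(x,z), & \text{if } x = y \\ 0, & \text{else} \end{cases} \tag{1.5}$$

choosing $f_N$ such that $\sup_{x \in \Gamma_N} \sum_{z \in \Gamma_N : (x,z) \in E(\Gamma_N)} p_N(x,z) \leq 1$. 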
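The construction of the discrete-time transition matrix can be sketched on a one-dimensional nearest-neighbour lattice. The sketch assumes rates of the form $p_N(x,y) = \sqrt{Q_N(y)/Q_N(x)}\, f_N(x,y)$ with a constant (hence symmetric) $f_N$, which is one choice compatible with reversibility; the double-well landscape, the lattice, and all names are hypothetical.

```python
import math

def build_transition_matrix(Q):
    """Nearest-neighbour chain on sites 0..n-1, reversible w.r.t. the weights Q.

    Assumption: rates p(x, y) = sqrt(Q[y] / Q[x]) * f with a constant f, chosen
    so that the off-diagonal rates in every row sum to at most 1/2.
    Q need not be normalized: only ratios of weights enter.
    """
    n = len(Q)
    raw = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in (x - 1, x + 1):
            if 0 <= y < n:
                raw[x][y] = math.sqrt(Q[y] / Q[x])
    f = 0.5 / max(sum(row) for row in raw)
    P = [[f * r for r in row] for row in raw]
    for x in range(n):
        P[x][x] = 1.0 - sum(P[x])  # the remaining mass stays at x
    return P

# Hypothetical double-well landscape F(m) = (m^2 - 1)^2 / 4 on [-1.5, 1.5].
N = 8
sites = [-1.5 + 0.25 * i for i in range(13)]
Q = [math.exp(-N * (m * m - 1.0) ** 2 / 4.0) for m in sites]
P = build_transition_matrix(Q)

# Detailed balance: Q(x) P(x, y) = Q(y) P(y, x) for every pair of sites.
reversible = all(
    math.isclose(Q[a] * P[a][b], Q[b] * P[b][a], rel_tol=1e-9, abs_tol=1e-15)
    for a in range(len(Q)) for b in range(len(Q))
)
print(reversible)
```

Bounding the row sums by 1/2 rather than 1 makes the chain lazy, which is stricter than the normalization condition requires but avoids periodicity in such a small example.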

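Similarly, in the continuous time case, we can use the rates to define the generator 

$$A_N(x,y) \equiv \begin{cases} p_N(x,y), & \text{if } (x,y) \in E(\Gamma_N) \\ -\sum_{z \in \Gamma_N : (x,z) \in E(\Gamma_N)} p_N(x,z), & \text{if } x = y \\ 0, & \text{else} \end{cases} \tag{1.6}$$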

Our basic approach to the analysis of these Markov chains is to observe the process when 
it is visiting the positions of the minima of the function $F_N$, i.e. the points of the set $\mathcal{M}_N$, 
and to record the elapsed time. The ideology behind this is that we suspect the process 
to show the following typical behaviour: starting at any given point, it will rather quickly 
(i.e. in some time of order N^) visit a nearby minimum, and then visit this same minimum 
at similar time intervals an exponentially large number of times without visiting any other 
minimum between successive returns. Then, at some random moment it will go, quickly 
again, to some other minimum which will then be visited regularly a large number of times, 
and so on. Moreover, between successive visits of a minimum the process will typically not 
only avoid visits at other minima, but will actually stay very close to the given minimum. 
Thus, recording the visits at the minima will be sufficient information on the behaviour 
of the process. These expectations will be shown to be justified (see in particular Section 
7). Incidentally, we mention that the "quick" processes of transitions can be analysed in 
detail using large deviation methods [WF1-4]. In [BG2] a large deviation principle is proven 
for a class of Markov chains including those considered here that shows that the "paths" 
of such quick processes concentrate asymptotically near the classical trajectories of some 
(relativistic) Hamiltonian system. More precisely, the transitions between minima can be 
identified as instanton solutions of the corresponding Hamiltonian system. 

Let us mention that the strategy to record visits at single points is specific to the discrete 
state space. In the diffusion setting, visits at single points do not happen with sufficient 
probability to contain pertinent information on the process. Indeed, the crucial fact we use 
is that in the discrete case it is excessively difficult for the process to stay for a time of order 
$N^k$ (we will discuss the values of $k$ later) in the vicinity of a minimum without visiting it^13, 
which in the continuum is not the case. For this reason Freidlin and Wentzell record visits 
not at single points but at certain neighborhoods of minima and critical points which has 
the disadvantage that such visits do not exactly allow a splitting of the process and this 
introduces some error terms in estimates which in our setting can easily be avoided. This is 
the main advantage we draw from working in a discrete space. 

The informal discussion above will be made precise in the sequel. We place ourselves 
in the discrete time setting throughout this paper, but everything can be transferred to 
the continuous time setup with mild modifications, if desired. Let us first introduce some 
notation. We will use the symbol $\mathbb{P}$ for the law of our Markov chain, omitting the explicit 
mention of the index $N$, and denote by $X_t$ the coordinate variables. We will write $\tau^y_x$ for the 
first time the process conditioned to starting at $y$ hits the point $x$, i.e. we write 

$$\mathbb{P}[\tau^y_x = t] \equiv \mathbb{P}[X_t = x,\ \forall_{0<s<t}\, X_s \neq x \mid X_0 = y] \tag{1.7}$$

^13 The reader may wonder at this point why the minima are so special compared e.g. with their neighboring 
points. In fact they are not, and nothing would change if we chose some other point close to the minimum 
rather than the exact minimum. But of course the minima themselves are the optimal choice, and also the 
most natural ones. 
for $t > 0$. In the case $x = y$, we will insist that $\tau^y_y$ is the time of the first visit to $y$ after 
$t = 0$, i.e. $\mathbb{P}[\tau^y_y = 0] = 0$. This notation may look unusual at first sight, but we are convinced 
that the reader will come to appreciate its convenience. 

One of the most useful basic identities which follows directly from the strong Markov 
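property and the fact that $\tau^y_x$ is a stopping time is the following: 

Lemma 1.1: Let $x,y,z$ be arbitrary points in $\Gamma_N$. Then 

$$\mathbb{P}[\tau^y_x = t] = \mathbb{P}[\tau^y_x = t,\ \tau^y_x < \tau^y_z] + \sum_{0<s<t} \mathbb{P}[\tau^y_z = s,\ \tau^y_z < \tau^y_x]\, \mathbb{P}[\tau^z_x = t-s] \tag{1.8}$$

Proof: Just note that the process either arrives at $x$ before visiting $z$, or it visits $z$ a first 
time before $x$. $\diamondsuit$ 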

A simple consequence is the following basic renewal equation. 

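Lemma 1.2: Let $x,y \in \Gamma_N$. Then 

$$\mathbb{P}[\tau^y_x = t] = \sum_{n=0}^{\infty} \sum_{\substack{t_1,\dots,t_{n+1} \\ \sum_i t_i = t}} \prod_{i=1}^{n} \mathbb{P}[\tau^y_y = t_i,\ \tau^y_y < \tau^y_x]\ \mathbb{P}[\tau^y_x = t_{n+1},\ \tau^y_x < \tau^y_y] \tag{1.9}$$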



The fundamental importance of the decomposition of Lemma 1.2 lies in the fact that 
objects like the last factor in (1.9) are "reversible", i.e. they can be compared to their 
time-reversed counterpart. To formulate a general principle, let us define the time-reversed chain 
corresponding to a transition from $y$ to $x$ via $X^r_t \equiv X_{\tau^y_x - t}$. For an event $\mathcal{A}$ that is measurable 
with respect to the sigma algebra $\mathcal{F}(X_s, 0 \leq s \leq \tau^y_x)$ we then define the time reversed event $\mathcal{A}^r$ 
as the event that takes place for the chain $X_t$ if and only if the event $\mathcal{A}$ takes place for 
the chain $X^r_t$. This allows us to formulate the next lemma: 

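Lemma 1.3: Let $x,y \in \Gamma_N$, and let $\mathcal{A}$ be any event measurable with respect to the sigma 
algebra $\mathcal{F}(X_s, 0 \leq s \leq \tau^y_x)$. Let $\mathcal{A}^r$ denote the time reversion of the event $\mathcal{A}$. Then 

$$Q_N(y)\,\mathbb{P}[\mathcal{A},\ \tau^y_x < \tau^y_y] = Q_N(x)\,\mathbb{P}[\mathcal{A}^r,\ \tau^x_y < \tau^x_x] \tag{1.10}$$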



For example, we have 

$$Q_N(y)\,\mathbb{P}[\tau^y_x = t,\ \tau^y_x < \tau^y_y] = Q_N(x)\,\mathbb{P}[\tau^x_y = t,\ \tau^x_y < \tau^x_x] \tag{1.11}$$
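The mechanism behind this reversibility identity can be sketched for a single path. The following is the standard detailed-balance argument, given here as an illustration under the sole assumption that the chain $P_N$ is reversible with respect to $Q_N$; it is not claimed to be the authors' own proof.

```latex
% Assumed detailed balance for the chain P_N with respect to Q_N:
%   Q_N(z) P_N(z,z') = Q_N(z') P_N(z',z)   for all z, z' in \Gamma_N.
% Fix a path y = z_0, z_1, ..., z_k = x that visits neither x nor y
% at the intermediate times 0 < i < k, and move Q_N through one step at a time:
\begin{align*}
Q_N(y)\prod_{i=1}^{k} P_N(z_{i-1},z_i)
  &= Q_N(z_0)\,P_N(z_0,z_1)\prod_{i=2}^{k} P_N(z_{i-1},z_i) \\
  &= P_N(z_1,z_0)\,Q_N(z_1)\prod_{i=2}^{k} P_N(z_{i-1},z_i) \\
  &= \dots \;=\; \prod_{i=1}^{k} P_N(z_i,z_{i-1})\; Q_N(x).
\end{align*}
% The right-hand side is Q_N(x) times the probability of the reversed path
% x = z_k, z_{k-1}, ..., z_0 = y. Summing over all paths in a given event
% yields the corresponding identity for events.
```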
Of course the power of Lemma 1.3 comes to bear when $x$ and $y$ are such that the ratio 
between $Q_N(x)$ and $Q_N(y)$ is very large or very small. 

Formulas like (1.9) invite the use of Laplace transforms. Let us first generalize the notion 
of stopping times to arrival times in sets, i.e. for any set $I \subset \Gamma_N$ we will set $\tau^x_I$ to be the 
time of the first visit of the process, starting at $x$, to the set $I$. With this notion we define 
the corresponding Laplace transforms 

$$G^y_{x,I}(u) \equiv \sum_{t>0} e^{ut}\, \mathbb{P}[\tau^y_x = t,\ \tau^y_x \leq \tau^y_I] \equiv \mathbb{E}\left[e^{u\tau^y_x}\, \mathbb{1}_{\tau^y_x \leq \tau^y_I}\right], \quad u \in \mathbb{C} \tag{1.12}$$

(We want to include the possibility that $I$ contains $x$ and/or $y$ for later convenience.) As 
we will see it is important to understand what the domains of these functions are. Since the 
Laplace transforms defined in (1.12) are Laplace transforms of the distributions of positive 
random variables, all these functions exist and are analytic at least for all $u \in \mathbb{C}$ with 
$\mathrm{Re}(u) < 0$. Moreover, if $G^y_{x,I}(u_0)$ is finite for some $u_0 \in \mathbb{R}$ then it is analytic in the 
half-space $\mathrm{Re}(u) < u_0$. As we will see later, each of the functions introduced in (1.12) will 
actually exist and be finite for some $u_0 > 0$ (depending on $N$). 

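Note that in particular 

$$G^y_{x,I}(0) = \mathbb{P}[\tau^y_x \leq \tau^y_I] \tag{1.13}$$

and 

$$\frac{d}{du} G^y_{x,I}(u = 0) \equiv \dot{G}^y_{x,I}(0) = \mathbb{E}\left[\tau^y_x\, \mathbb{1}_{\tau^y_x \leq \tau^y_I}\right] \tag{1.14}$$

The expected time of reaching $x$ from $y$ conditioned on the event not to visit $I$ in the meantime 
is expressed in terms of these functions as 

$$\frac{\dot{G}^y_{x,I}(0)}{G^y_{x,I}(0)} = \mathbb{E}\left[\tau^y_x \mid \tau^y_x \leq \tau^y_I\right] \tag{1.15}$$

An important consequence of Lemma 1.3 is 

Lemma 1.4: Assume that $I$ is any subset of $\Gamma_N$ containing $x$ and $y$. Then 

$$Q_N(y)\, G^y_{x,I}(u) = Q_N(x)\, G^x_{y,I}(u) \tag{1.16}$$

Proof: Immediate from Lemma 1.3. $\diamondsuit$ 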
Lemma 1.4 implies in particular that the Laplace transforms of the conditional times are 
invariant under reversal, i.e. 

$$\frac{G^y_{x,I}(u)}{G^y_{x,I}(0)} = \frac{G^x_{y,I}(u)}{G^x_{y,I}(0)} \tag{1.17}$$

and in particular 

$$\mathbb{E}\left[\tau^y_x \mid \tau^y_x \leq \tau^y_I\right] = \mathbb{E}\left[\tau^x_y \mid \tau^x_y \leq \tau^x_I\right] \tag{1.18}$$

(1.17) expresses the well-known but remarkable fact that in a reversible process the 
conditional times to reach a point $x$ from $y$ without return to $y$ are equal to those to reach $y$ 
from $x$ without return to $x$. 

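A special role will be played by the Laplace transforms for which the exclusion sets are all 
the minima. We will denote these by $g^y_x(u) \equiv G^y_{x,\mathcal{M}_N}(u)$. Indeed, we think of the events 
$\{\tau^y_x \leq \tau^y_{\mathcal{M}_N}\}$, for $x,y \in \mathcal{M}_N$, as elementary transitions and decompose any process going from 
one minimum to another into such elementary transitions. This gives for $G^y_x(u) \equiv G^y_{x,x}(u)$: 

Lemma 1.5: Let $x,y \in \mathcal{M}_N$. Denote by $\omega$ an arbitrary sequence 
$\omega = \omega_0,\omega_1,\omega_2,\omega_3,\dots,\omega_{|\omega|}$ of elements $\omega_i \in \mathcal{M}_N$. Then we have 

$$G^y_x(u) = \sum_{\omega : y \to x} p(\omega) \prod_{i=1}^{|\omega|} \frac{g^{\omega_{i-1}}_{\omega_i}(u)}{g^{\omega_{i-1}}_{\omega_i}(0)} \tag{1.19}$$

where 

$$p(\omega) \equiv \prod_{i=1}^{|\omega|} \mathbb{P}\left[\tau^{\omega_{i-1}}_{\omega_i} \leq \tau^{\omega_{i-1}}_{\mathcal{M}_N}\right] \tag{1.20}$$

and $\omega : y \to x$ indicates that the sum is over such walks for which $\omega_0 = y$ and $\omega_{|\omega|} = x$, and 
$\omega_i \neq x$ for all $0 < i < |\omega|$. 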

Lemma 1.5 can be thought of as a random walk representation of our process as observed 
on the minima only. As we will show soon, the quantities $g^{\omega_{i-1}}_{\omega_i}(u)/g^{\omega_{i-1}}_{\omega_i}(0)$ are rather harmless, i.e. 
they do not explode in a small neighborhood of zero, and e.g. their derivative at zero is 
at most polynomially large in $N$. On the other hand, we will also see that the "transition 
probabilities" $\mathbb{P}[\tau^{\omega_{i-1}}_{\omega_i} \leq \tau^{\omega_{i-1}}_{\mathcal{M}_N}]$ are all exponentially small provided that $\omega_{i-1} \neq \omega_i$. This 
means that a typical walk will contain enormously long "boring" chains of repeated returns to 
the same point. It is instructive to observe that these repeated returns to a given minimum 
can be re-summed, to obtain a representation in terms of walks that do not contain zero 
steps: 
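Lemma 1.6: Let $x,y \in \mathcal{M}_N$. Denote by $\omega$ a sequence $\omega = \omega_0,\omega_1,\omega_2,\omega_3,\dots,\omega_{|\omega|}$ of 
elements $\omega_i \in \mathcal{M}_N$ such that for all $i$, $\omega_i \neq \omega_{i+1}$. Then we have 

$$G^y_x(u) = \sum_{\omega : y \to x} \tilde{p}(\omega) \prod_{i=1}^{|\omega|} \frac{1 - g^{\omega_{i-1}}_{\omega_{i-1}}(0)}{1 - g^{\omega_{i-1}}_{\omega_{i-1}}(u)}\, \frac{g^{\omega_{i-1}}_{\omega_i}(u)}{g^{\omega_{i-1}}_{\omega_i}(0)} \tag{1.21}$$

where 

$$\tilde{p}(\omega) \equiv \prod_{i=1}^{|\omega|} \frac{\mathbb{P}\left[\tau^{\omega_{i-1}}_{\omega_i} \leq \tau^{\omega_{i-1}}_{\mathcal{M}_N}\right]}{1 - \mathbb{P}\left[\tau^{\omega_{i-1}}_{\omega_{i-1}} \leq \tau^{\omega_{i-1}}_{\mathcal{M}_N}\right]} \tag{1.22}$$

The reason for writing Lemma 1.6 in the above form is that it entails as a corollary the 
following expression for the expected transition time: 

$$\mathbb{E}\,\tau^y_x = \sum_{\omega : y \to x} \tilde{p}(\omega) \sum_{i=1}^{|\omega|} \left( \frac{\dot{g}^{\omega_{i-1}}_{\omega_{i-1}}(0)}{1 - g^{\omega_{i-1}}_{\omega_{i-1}}(0)} + \frac{\dot{g}^{\omega_{i-1}}_{\omega_i}(0)}{g^{\omega_{i-1}}_{\omega_i}(0)} \right) \tag{1.23}$$

Note that $\tilde{p}(\omega)$ has indeed a natural interpretation as the probability of the sequence of steps 
$\omega$, while each term in the sum is the expected time such a step takes. Moreover, this time 
consists of two pieces: the first is a waiting time which in fact arises from the re-summation 
of the many returns before a transition takes place, while the second is the time of the actual 
transition, once it really happens. Note that the first term is enormous since the denominator, 
$1 - g^{\omega_{i-1}}_{\omega_{i-1}}(0) = \mathbb{P}[\tau^{\omega_{i-1}}_{\mathcal{M}_N \setminus \omega_{i-1}} < \tau^{\omega_{i-1}}_{\omega_{i-1}}]$, is exponentially small. 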
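The re-summation of repeated returns mentioned above can be made explicit for a single minimum. The following is a short worked sketch in the notation $g^y_x(u) = G^y_{x,\mathcal{M}_N}(u)$ of the text; the symbols $m$ and $m'$ for two distinct minima are introduced here only for illustration.

```latex
% A stay at the minimum m consists of n >= 0 returns to m (zero steps),
% followed by one genuine step to a different minimum m'. Summing over n
% gives a geometric series, convergent whenever |g^m_m(u)| < 1:
\begin{align*}
\sum_{n=0}^{\infty} \bigl(g^{m}_{m}(u)\bigr)^{n}\, g^{m}_{m'}(u)
  &= \frac{g^{m}_{m'}(u)}{1 - g^{m}_{m}(u)} .
\end{align*}
% At u = 0 this is the probability that the walk on the minima steps from m to m':
%   g^m_{m'}(0) / (1 - g^m_m(0)).
% The logarithmic derivative at u = 0 is the expected duration of such a step,
% a waiting term plus a transition term:
\begin{align*}
\frac{d}{du}\,\ln\frac{g^{m}_{m'}(u)}{1 - g^{m}_{m}(u)}\bigg|_{u=0}
  &= \frac{\dot g^{m}_{m}(0)}{1 - g^{m}_{m}(0)}
   + \frac{\dot g^{m}_{m'}(0)}{g^{m}_{m'}(0)} .
\end{align*}
```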

Remark: Lemma 1.6 does provide a representation of the process on the minima in terms 
of an embedded Markov chain with exponentially small transition probabilities. Moreover, 
we expect that for N large, the waiting times will be almost exponentially distributed (but 
with very different rates!), while transitions happen essentially instantaneously on the scale 
of even the fastest waiting time. This is the analogue of the controlling Markov processes 
constructed in Freidlin and Wentzell (see in particular Chap. 6.2 of [FW]). 

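In the case where $\mathcal{M}_N$ consists of only two points, Lemma 1.6 already provides the full 
solution to the problem since the only walk left is the single step $(y,x)$. 

Corollary 1.7: Assume that $\mathcal{M}_N = \{x,y\}$. Then 

$$G^y_x(u) = \frac{g^y_x(u)}{1 - g^y_y(u)} \tag{1.24}$$

and 

$$\mathbb{E}\,\tau^y_x = \frac{\dot{g}^y_y(0)}{g^y_x(0)} + \frac{\dot{g}^y_x(0)}{g^y_x(0)} \tag{1.25}$$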
Proof: Just use that in this particular setting, $1 - g^y_y(0) = g^y_x(0)$. $\diamondsuit$ 

Remark: (1.25) can be written in the maybe more instructive form 

$$\mathbb{E}\,\tau^y_x = \frac{1 - g^y_x(0)}{g^y_x(0)}\, \frac{\dot{g}^y_y(0)}{g^y_y(0)} + \frac{\dot{g}^y_x(0)}{g^y_x(0)} \tag{1.26}$$

As we will see, all the ratios of the type $\dot{g}(0)/g(0)$ represent the expected times of a transition 
conditioned on the event that this transition happens and should be thought of as "small". 
On the other hand, the probability $g^y_x(0)$ will be shown to be exponentially small so that the 
first factor in the first term in (1.26) is extremely large. Thus, to get a precise estimate on 
the expected transition time in this case, it suffices to compute precisely the two quantities 
$\dot{g}^y_y(0)$ and $g^y_x(0)$ only (the second term in (1.26) being negligible in comparison). One might 
be tempted to think that in the general case the random walk representation given through 
Lemma 1.6 would similarly lead to a reduction of the problem to that of computing the 
corresponding quantities at and between all minima. This however is not so. The reason 
is that the walks $\omega$ still can perform more complicated multiple loops and these loops will 
introduce new and more singular terms than those that appear explicitly in (1.21) and (1.23). This 
renders this representation much less useful than it appears at first sight. On the other hand, 
the structure of the representation of Corollary 1.7 will be rather universal. Indeed, it is easy 
to see that with our notations we have the following 

Lemma 1.8: Let $I \subset \Gamma_N$. Then for all $y \notin I \cup x$, 

$$G^y_{x,I}(u) = \frac{G^y_{x,\{I \cup y\}}(u)}{1 - G^y_{y,\{I \cup x\}}(u)} \tag{1.27}$$

holds for all $u$ for which the left-hand side exists. 

Proof: Separating paths that reach $x$ from $y$ without return to $y$ from those that do return, 
and splitting the latter at the first return time, using the strong Markov property, we get 
that 

$$G^y_{x,I}(u) = G^y_{x,\{I \cup y\}}(u) + G^y_{y,\{I \cup x\}}(u)\, G^y_{x,I}(u) \tag{1.28}$$

By construction, if $G^y_{x,I}(u)$ is finite, the second summand being less than the left-hand side, 
we have that $G^y_{y,\{I \cup x\}}(u) < 1$, so that (1.27) follows. $\diamondsuit$ 

Lemma 1.8 will be one of our crucial tools. In particular, since it relates functions with 
exclusion sets / to functions with larger exclusion sets, it suggests control over the Laplace 
transforms via induction over the size of the exclusion sets. 
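Lemma 1.8 has two important consequences that are obtained by setting $u = 0$ in (1.27) 
and by taking the derivative of (1.27) with respect to $u$ and evaluating the result at $u = 0$: 

Corollary 1.9: Let $I \subset \Gamma_N$. Then for all $y \notin I \cup x$, 

$$\mathbb{P}\left[\tau^y_x \leq \tau^y_I\right] = \frac{\mathbb{P}\left[\tau^y_x \leq \tau^y_{I \cup y}\right]}{\mathbb{P}\left[\tau^y_{I \cup x} < \tau^y_y\right]} \tag{1.29}$$

and 

$$\mathbb{E}\left[\tau^y_x \mid \tau^y_x \leq \tau^y_I\right] = \mathbb{E}\left[\tau^y_x \mid \tau^y_x \leq \tau^y_{I \cup y}\right] + \frac{\mathbb{E}\left[\tau^y_y\, \mathbb{1}_{\tau^y_y \leq \tau^y_{I \cup x}}\right]}{\mathbb{P}\left[\tau^y_{I \cup x} < \tau^y_y\right]} \tag{1.30}$$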
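The renewal identity for hitting probabilities, i.e. Lemma 1.8 evaluated at $u = 0$, can be checked numerically on a small chain. The sketch below is only an illustration: the biased nearest-neighbour walk, the choice of $y$, $x$ and $I$, and the helper names `solve` and `prob_reach` are all hypothetical. Hitting probabilities are obtained by solving the associated linear (Dirichlet) system after a first-step decomposition.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system A u = b."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            fac = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= fac * M[c][k]
    u = [0.0] * n
    for r in range(n - 1, -1, -1):
        u[r] = (M[r][n] - sum(M[r][k] * u[k] for k in range(r + 1, n))) / M[r][r]
    return u

def prob_reach(P, y, x, I):
    """Probability that the chain started at y reaches x no later than it visits I.

    Visits are counted strictly after time 0, so y itself may belong to I.
    """
    n = len(P)
    stop = set(I) | {x}
    free = [z for z in range(n) if z not in stop]
    pos = {z: i for i, z in enumerate(free)}
    # h(z): probability to hit x before the rest of `stop`, for z outside `stop`.
    A = [[(1.0 if z == w else 0.0) - P[z][w] for w in free] for z in free]
    h = solve(A, [P[z][x] for z in free])
    value = lambda z: 1.0 if z == x else (h[pos[z]] if z in pos else 0.0)
    return sum(P[y][z] * value(z) for z in range(n))  # first-step decomposition

# Hypothetical toy chain: lazy nearest-neighbour walk on 0..8 with a drift to the left.
n = 9
P = [[0.0] * n for _ in range(n)]
for z in range(n):
    if z > 0:
        P[z][z - 1] = 0.3
    if z < n - 1:
        P[z][z + 1] = 0.2
    P[z][z] = 1.0 - sum(P[z])

y, x, I = 4, 7, {0, 1}
lhs = prob_reach(P, y, x, I)
rhs = prob_reach(P, y, x, I | {y}) / (1.0 - prob_reach(P, y, y, I | {x}))
print(lhs, rhs)  # the two numbers should agree up to rounding
```

The identity itself does not use reversibility, only the strong Markov property at the first return to $y$; the example merely confirms this on one chain.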



1.3. Outline of the general strategy. 



As indicated above, an important tool in our analysis will be the use of induction over the 
size of exclusion sets with the help of Lemmata 1.1 and 1.8. One of the basic inputs for this will 
be a priori estimates on the quantities $g^y_x(u)$. These will be based on the representation of 
these functions as solutions of certain Dirichlet problems associated to the operator $(1 - e^u P_N)$ 
with Dirichlet boundary conditions in a set containing $\mathcal{M}_N$. 

The crucial point here is to have Dirichlet boundary conditions at all the minima of $F_N$ 
and at $y$. Without these boundary conditions, the stochastic matrix $P$ is symmetric in the 
space $\ell^2(\Gamma_N, Q_N)$ and has a maximal eigenvalue 1 with corresponding (right) eigenvector 1; 
since this eigenvector does not satisfy the Dirichlet boundary conditions at the minima, the 
spectrum of the Dirichlet operator lies strictly below 1, so that for sufficiently small values 
of $u$, $1 - e^u P_N$ is invertible. It is essential to know by how much the Dirichlet conditions push 
the spectrum down. It turns out that Dirichlet boundary conditions at all the minima push 
the spectrum by an amount of at least $CN^{-d-1}$ below one, and this will allow us not only to 
construct the solution but to get very good control on its behaviour. If, on the other hand, 
not all the minima had received Dirichlet conditions, we must expect that the spectrum is 
only pushed down by an exponentially small amount, and we will have to devise different 
techniques to deal with these quantities. 

As a matter of fact, while the spectral properties discussed above follow from our esti- 
mates, we will not use these to derive them. The point is that what we really need are 
pointwise estimates on our functions, rather than $\ell^2$ estimates, and we will actually use more 
probabilistic techniques to prove $\ell^\infty$ estimates as key inputs. The main result, proven in 
Section 3, will be the following theorem: 
Theorem 1.10: There exists a constant $c > 0$ such that for all $x \in \Gamma_N$, $y \in \mathcal{M}_N$ the 
functions $g^y_x(u)$ are analytic in the half-plane $\mathrm{Re}(u) < cN^{-d-3/2}$. Moreover, for such $u$, for 
any non-negative integer $k$ there exists a constant $C_k$ such that 

$$\left| \frac{d^k}{du^k}\, g^y_x(u) \right| \leq C_k\, N^{k(d+3/2)+d/2}\, e^{N\left[F_N(y) - F_N(z^*(y,x))\right]} \tag{1.31}$$

where $z^*(y,x)$ is defined in (1.3). 

These estimates are not overly sharp, and there are no corresponding lower bounds. 
Therefore, our strategy will be to use these estimates only to control sub-leading expressions and 
to use different methods to control the leading quantities, which will be seen to be certain of 
the expected return times, like $\dot{g}^y_y(0)$, and the transition probabilities $\mathbb{P}[\tau^y_x < \tau^y_y]$. The latter 
quantities will be estimated up to a multiplicative error of order $N^{1/2}$ in Section 2. In fact 
we will prove there the following theorem: 

Theorem 1.11: With the notation of Theorem 1.10 there exist finite positive constants 
$c, C$ such that if $x \neq y \in \mathcal{M}_N$, then 

P [t^ < tI\ < cAf^e-^[^^(^'(2''"))-^^(2')] (1.32) 

and 

P [r^ < tI\ > CAr^e-^[^^(-*(^'-))-^^(^)] (1.33) 



The estimates for the return times require some more preparation and will be stated only 
in Section 5, but let us mention that the main idea in getting sharp estimates for them is the 
use of the ergodic theorem. 

Equipped with these inputs we will, in Section 4, proceed to the analysis of general 
transition processes. We will introduce a natural tree structure on the set of minima and show 
that any transition between two minima can be uniquely decomposed into a sequence of 
so-called "admissible transitions" in such a way that with probability rapidly tending to one 
(as N I oo), the process will consist of this precise sequence of transitions. This will require 
large deviation estimates in path space that are special cases of more general results that 
have recently been proven in [BG2]. 

In Section 5 we will investigate the transition times of admissible transitions. In the first 
sub-section we will prove sharp bounds on the expected times of such admissible transitions 
with upper and lower bounds differing only by a factor of $N^{1/2}$. This will be based on
more general upper bounds on expected times of general types of transitions that will be 
proven by induction. In the second sub-section we show that the rescaled transition times 
converge (along subsequences) to exponentially distributed random variables. This result 
again is based on an inductive proof establishing control on the rather complicated analytic 
structure of the Laplace transforms of a general class of transition times. In Section 6 we use 
these results to derive some consequences: We show that during an admissible transition, at 
any given time, the process is close to the starting point with probability close to 1, that it 
converges exponentially to equilibrium, etc. Section 7 motivates the connection between our 
Markov chains and Glauber dynamics of disordered mean field models, and in Section 8 we 
discuss a specific example, the random field Curie-Weiss model.

Notation: We have made an effort to use a notation that is at the same time concise and unambiguous. This has required some compromise and it may be useful to outline our policy here. First, all objects associated with our Markov chains depend on $N$. We make this evident in some cases by a subscript $N$. However, we have omitted this subscript in other cases, in particular when there is already a number of other indices that are more important (as in $G^x_y(u)$), or in ever recurring objects like $\mathbb{P}$ and $\mathbb{E}$, which sometimes will have to be distinguished from the laws of modified Markov chains by other subscripts. Constants $c, C, k$ etc. will always be understood to depend on the details of the Markov chain, but to be independent of $N$ for $N$ large. There will appear constants $K_N > 0$ that will depend on $N$ in a way depending on the details of the chain, but such that for some $\alpha > 0$, $N^{1-\alpha} K_N \uparrow \infty$ (this can be seen as a requirement on the chain). Specific letters are reserved for a particular meaning only locally in the text.

Acknowledgements: A.B. would like to thank Enzo Olivieri for an inspiring discussion on reversible dynamics that has laid the foundation of this work. M.K. thanks Johannes Sjöstrand and B. Helffer for helpful discussions. The final draft of the paper has benefited from comments by and discussions with Enzo Olivieri, Elisabetta Scoppola, and Francesca Nardi. Finally, V.G. and A.B. thank the Weierstrass-Institute, Berlin, and the Centre de Physique Théorique, Marseille, for hospitality and financial support that has made this collaboration possible.

2. Precise estimates on transition probabilities 

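In this section we prove Theorem 1.11. A key ingredient is the following variational representation of the probabilities

$$G^y_{x,y}(0) = \mathbb{P}[\tau^y_x < \tau^y_y], \qquad x, y \in \mathcal{M}_N \tag{2.1}$$

that can be found in Liggett's book ([Li], p. 99, Theorem 6.1).

Theorem 2.1: [Li] Let $\mathcal{H}^y_x$ denote the space of functions

$$\mathcal{H}^y_x = \{h : \Gamma_N \to [0,1] \,:\, h(y) = 0,\ h(x) = 1\} \tag{2.2}$$

and define the Dirichlet form

$$\Phi_N(h) = \sum_{x', x'' \in \Gamma_N} Q_N(x')\, p_N(x', x'')\, [h(x') - h(x'')]^2 \tag{2.3}$$

Then

$$\mathbb{P}[\tau^y_x < \tau^y_y] = \frac{1}{2 Q_N(y)} \inf_{h \in \mathcal{H}^y_x} \Phi_N(h) \tag{2.4}$$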

Proof: See Liggett [Li], Chapter II.6. Note that the set $R$ in Liggett's book (page 98) will be $\Gamma_N \setminus \{x\}$, and our $\mathcal{H}^y_x$ would be $H_{\Gamma_N \setminus \{x\}}$ in his notation. ◊
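The variational representation can be checked by hand on a very small reversible chain. The sketch below is only an illustration under stated assumptions: the four-vertex graph, its edge weights, and the helper names are hypothetical, and the normalisation $\Phi_N(h)/(2Q_N(y))$ assumes that the Dirichlet form sums over ordered pairs of sites. It computes the harmonic minimiser exactly, evaluates the escape probability from it, and compares with the Dirichlet form.

```python
from fractions import Fraction

def dirichlet_form(c, h):
    """Sum over ordered pairs (v, w) of Q(v) p(v, w) (h(v) - h(w))^2,
    where c[(v, w)] = Q(v) p(v, w) is stored for both orientations."""
    return sum(cvw * (h[v] - h[w]) ** 2 for (v, w), cvw in c.items())

def harmonic_minimiser(c, n, x, y):
    """Solve h(x) = 1, h(y) = 0, h harmonic elsewhere, by exact Gauss-Jordan."""
    A = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for v in (x, y):
        A[v][v] = Fraction(1)
        A[v][n] = Fraction(1 if v == x else 0)
    for (v, w), cvw in c.items():
        if v not in (x, y):
            A[v][v] += cvw
            A[v][w] -= cvw
    for col in range(n):
        piv = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        A[col] = [a / A[col][col] for a in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[v][n] for v in range(n)]

# Hypothetical example: symmetric edge weights on four sites, y = 0, x = 2.
edges = {(0, 1): 1, (1, 2): 2, (0, 3): 1, (3, 2): 1, (1, 3): 1}
c = {}
for (v, w), weight in edges.items():
    c[(v, w)] = Fraction(weight)
    c[(w, v)] = Fraction(weight)
n, x, y = 4, 2, 0
Q = [sum(cvw for (v, _), cvw in c.items() if v == s) for s in range(n)]

h = harmonic_minimiser(c, n, x, y)
# Probability to reach x before returning to y: first step from y, then h.
escape = sum(cvw * h[w] for (v, w), cvw in c.items() if v == y) / Q[y]
variational = dirichlet_form(c, h) / (2 * Q[y])
# A non-optimal trial function can only give a larger value.
trial = [Fraction(0), Fraction(1, 2), Fraction(1), Fraction(1, 2)]
upper = dirichlet_form(c, trial) / (2 * Q[y])
```

Any trial function in the admissible class gives an upper bound in this way, which is the mechanism used for the upper bound in Theorem 1.11.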

Proof of Theorem 1.11: The proof of the upper bound (1.32) is very easy. We will just construct a suitable trial function $h \in \mathcal{H}^y_x$ and bound the infimum in (2.4) by the value of the Dirichlet form $\Phi_N$ at this function.

To this end we construct a 'hyper-surface'* $\mathcal{S}_N \subset \Gamma_N$ separating $x$ and $y$ such that

i) $z^*(y,x) \in \mathcal{S}_N$.

ii) $\forall z \in \mathcal{S}_N$, $F_N(z) \ge F_N(z^*(y,x))$.

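$\mathcal{S}_N$ splits $\Gamma_N$ into two components $\Gamma_x$ and $\Gamma_y$ which contain $x$ and $y$, respectively. Let $\chi \in C^\infty([0,\infty),[0,1])$ with $\chi(s) = 1$ for $s \in [0,1)$ and $\chi(s) = 0$ for $s > 2$. Then put

$$h_N(x') = \begin{cases} \chi\big(N^{1/2}\,\mathrm{dist}(x', \mathcal{S}_N)\big), & \text{for } x' \in \Gamma_y \\ 1, & \text{for } x' \in \Gamma_x \end{cases} \tag{2.5}$$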



* We actually require no analytic properties for the set $\mathcal{S}_N$ and the term hyper-surface should not be taken very seriously.
Clearly, $h_N$ is constant outside a layer $\mathcal{L}_N$ of width $N^{-1/2}$ around $\mathcal{S}_N$. Since the transition matrix $p_N(x',x'')$ vanishes for $|x' - x''| \ge CN^{-1}$, for some constant $C < \infty$, and since by construction

$$|h_N(x') - h_N(x'')| \le cN^{1/2}\,|x' - x''| \tag{2.6}$$

we obtain

$$\mathbb{P}[\tau^y_x < \tau^y_y] \le \mathrm{const.}\, N^{-1} \sum_{x', x'' \in \mathcal{L}_N} \frac{Q_N(x')}{Q_N(y)}\, p_N(x',x'') \le \frac{C}{N}\, \frac{Q_N(z^*(x,y))}{Q_N(y)} \sum_{x' \in \mathcal{L}_N} \frac{Q_N(x')}{Q_N(z^*(x,y))} \tag{2.7}$$

Since $F_N$ is assumed to have a quadratic saddle point at $z^*(x,y)$, using a standard Gaussian approximation the final sum is readily seen to be $O(N^{d/2})$, which gives the upper bound (1.32).

The main task of this section will be to establish the corresponding lower bound (1.33). The main idea of the proof of the lower bound is to reduce the problem to a sum of essentially one-dimensional ones which can be solved explicitly. The key observation is the following monotonicity property of the transition probabilities.

Lemma 2.2: Let $A \subset \Gamma_N$ be a subgraph of $\Gamma_N$ and let $\mathbb{P}_A$ denote the law of the Markov chain with transition rates

$$p_A(x',x'') = \begin{cases} p_N(x',x''), & \text{if } x' \neq x'' \text{ and } (x',x'') \in E(A) \\ 1 - \sum_{y' : (x',y') \in E(A)} p_N(x',y'), & \text{if } x' = x'' \\ 0, & \text{else} \end{cases} \tag{2.8}$$

Assume that $y, x \in A$. Then

$$\mathbb{P}[\tau^y_x < \tau^y_y] \ge \mathbb{P}_A[\tau^y_x < \tau^y_y] \tag{2.9}$$



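Proof: To prove Lemma 2.2 from here, just note that for any $h \in \mathcal{H}^y_x$,

$$\Phi_N(h) \ge \sum_{(x',x'') \in E(A)} Q_N(x')\, p_N(x',x'')\, [h(x') - h(x'')]^2 = Q_N(A) \sum_{(x',x'') \in E(A)} Q_A(x')\, p_A(x',x'')\, [h(x') - h(x'')]^2 \tag{2.10}$$

where $Q_A(x) = Q_N(x)/Q_N(A)$. This implies immediately that

$$\inf_{h \in \mathcal{H}^y_x} \Phi_N(h) \ge Q_N(A) \inf_{h \in \mathcal{H}^y_x} \Phi_A(h) = Q_N(A) \inf_{h \in \mathcal{H}^y_x(A)} \Phi_A(h) \tag{2.11}$$

where $\mathcal{H}^y_x(A) = \{h : A \to [0,1] \,:\, h(y) = 0,\ h(x) = 1\}$. Thus, using Theorem 2.1 for the process $\mathbb{P}_A$, we see that

$$\mathbb{P}[\tau^y_x < \tau^y_y] = \frac{1}{2Q_N(y)} \inf_{h \in \mathcal{H}^y_x} \Phi_N(h) \ge \frac{Q_N(A)}{2Q_N(y)} \inf_{h \in \mathcal{H}^y_x(A)} \Phi_A(h) = \mathbb{P}_A[\tau^y_x < \tau^y_y] \tag{2.12}$$

which proves the lemma. ◊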

To make use of this lemma, we will choose $A$ in a special way. Note that the simplest choice would be to choose $A$ as one single path connecting $y$ and $x$ over the saddle point $z^*(y,x)$ in an optimal way. However, such a choice would produce a bound of the form $CN^{-1/2} \exp\big(-N[F_N(z^*(y,x)) - F_N(y)]\big)$, which differs from the upper bound by a factor $N^{(d-1)/2}$. It seems clear that in order to improve this bound we must choose $A$ in such a way that it still provides "many" paths connecting $y$ and $x$. To do this we proceed as follows. Let $E$ be any number s.t. $F_N(z^*(y,x)) > E > \max(F_N(y), F_N(x))$ (e.g. choose $E = F_N(z^*(y,x)) - \tfrac{1}{2}\big(F_N(z^*(y,x)) - \max(F_N(y), F_N(x))\big)$). Denote by $D_y$, $D_x$ the connected components of the level set $\{x' \in \Gamma_N : F_N(x') \le E\}$ that contain the points $y$, resp. $x$.

Note that of course we cannot, due to the discrete nature of the set $\Gamma_N$, achieve that the function $F_N$ is constant on the actual discrete boundary of the sets $D_y$, $D_x$. The discrete boundary $\partial D$ of any set $D \subset \Gamma_N$ will be defined as

$$\partial D = \{x \in D \,|\, \exists\, y \in \Gamma_N \setminus D \text{ s.t. } (x,y) \in E(\Gamma_N)\} \tag{2.13}$$

We have, however, that

$$\sup_{x', x'' \in \partial D_x \cup \partial D_y} |F_N(x') - F_N(x'')| \le CN^{-1} \tag{2.14}$$

Next we choose a family of paths $\gamma_z : [0,1] \to \Gamma_N$, indexed by $z \in B \subset \mathcal{S}_N$, with the following properties:

i) $\gamma_z(0) \in \partial D_y$, $\gamma_z(1) \in \partial D_x$.

ii) For $z \neq z'$, $\gamma_z$ and $\gamma_{z'}$ are disjoint (i.e. they do not have common sites or common edges).

iii) $F_N$ restricted to $\gamma_z$ attains its maximum at $z$.
Of course we will choose the set $B \subset \mathcal{S}_N$ to be a small relative neighborhood in $\mathcal{S}_N$ of the saddle $z^*(y,x)$. In fact it will turn out to be enough to take $B$ a disc of diameter $CN^{-1/2}$ so that its cardinality is bounded by $|B| \le CN^{(d-1)/2}$.

For such a collection, we will set

$$A = D_x \cup D_y \cup \bigcup_{z \in B} V(\gamma_z) \tag{2.15}$$

where $V(\gamma_z)$ denotes the graph composed of the vertices that $\gamma_z$ visits and the edges along which it jumps; the unions are to be understood in the sense of the union of the corresponding subgraphs of $\Gamma_N$.



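Lemma 2.3: With $A$ defined above we have

$$\mathbb{P}_A[\tau^y_x < \tau^y_y] \ge \Big(1 - CN^{d/2} e^{-N[F_N(z^*(y,x)) - E]}\Big) \sum_{z \in B} \frac{Q_N(\gamma_z(1))}{Q_N(y)}\, \mathbb{P}_{\gamma_z}\big[\tau^{\gamma_z(1)}_{\gamma_z(0)} < \tau^{\gamma_z(1)}_{\gamma_z(1)}\big] \tag{2.16}$$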



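Proof: All paths on $A$ contributing to the event $\{\tau^y_x < \tau^y_y\}$ must now pass along one of the paths $\gamma_z$. Using the strong Markov property, we split the paths at the first arrival point in $D_x$, which gives the equality

$$\mathbb{P}_A[\tau^y_x < \tau^y_y] = \sum_{z \in B} \mathbb{P}_A\big[\tau^y_{\gamma_z(1)} \le \tau^y_{D_x \cup y}\big]\, \mathbb{P}_A\big[\tau^{\gamma_z(1)}_x < \tau^{\gamma_z(1)}_y\big] \tag{2.17}$$

By reversibility,

$$\begin{aligned} \mathbb{P}_A\big[\tau^y_{\gamma_z(1)} \le \tau^y_{D_x \cup y}\big] &= \frac{Q_N(\gamma_z(1))}{Q_N(y)}\, \mathbb{P}_A\big[\tau^{\gamma_z(1)}_y < \tau^{\gamma_z(1)}_{D_x}\big] \\ &= \frac{Q_N(\gamma_z(1))}{Q_N(y)}\, \mathbb{P}_A\big[\tau^{\gamma_z(1)}_{\gamma_z(0)} < \tau^{\gamma_z(1)}_{D_x}\big]\, \mathbb{P}_A\big[\tau^{\gamma_z(0)}_y < \tau^{\gamma_z(0)}_{D_x}\big] \end{aligned} \tag{2.18}$$

where in the last line we used that the path going from $\gamma_z(1)$ to $y$ without further visits to $D_x$ must follow $\gamma_z$. Note further that we have the equality

$$\mathbb{P}_A\big[\tau^{\gamma_z(1)}_{\gamma_z(0)} < \tau^{\gamma_z(1)}_{D_x}\big] = \mathbb{P}_{\gamma_z}\big[\tau^{\gamma_z(1)}_{\gamma_z(0)} < \tau^{\gamma_z(1)}_{\gamma_z(1)}\big] \tag{2.19}$$

where the right hand side is a purely one-dimensional object. We will now show that the probabilities $\mathbb{P}_A[\tau^{\gamma_z(1)}_x < \tau^{\gamma_z(1)}_y]$ and $\mathbb{P}_A[\tau^{\gamma_z(0)}_y < \tau^{\gamma_z(0)}_{D_x}]$ are exponentially close to 1. To see this, write

$$1 - \mathbb{P}_A\big[\tau^{\gamma_z(1)}_x < \tau^{\gamma_z(1)}_y\big] = \mathbb{P}_A\big[\tau^{\gamma_z(1)}_y < \tau^{\gamma_z(1)}_x\big] = \frac{\mathbb{P}_A\big[\tau^{\gamma_z(1)}_y < \tau^{\gamma_z(1)}_{x \cup \gamma_z(1)}\big]}{1 - \mathbb{P}_A\big[\tau^{\gamma_z(1)}_{\gamma_z(1)} < \tau^{\gamma_z(1)}_{x \cup y}\big]} \tag{2.20}$$

where the second equality follows from Corollary 1.9. Now by reversibility, the numerator in (2.20) satisfies the bound

$$\mathbb{P}_A\big[\tau^{\gamma_z(1)}_y < \tau^{\gamma_z(1)}_{x \cup \gamma_z(1)}\big] \le \sum_{x' \in B} \mathbb{P}_A\big[\tau^{\gamma_z(1)}_{x'} < \tau^{\gamma_z(1)}_{x \cup \gamma_z(1)}\big] \le |B|\, \frac{Q_A(z^*(y,x))}{Q_A(\gamma_z(1))} = |B|\, \frac{Q_N(z^*(y,x))}{Q_N(\gamma_z(1))} \tag{2.21}$$

On the other hand,

$$1 - \mathbb{P}_A\big[\tau^{\gamma_z(1)}_{\gamma_z(1)} < \tau^{\gamma_z(1)}_{x \cup y}\big] = \mathbb{P}_A\big[\tau^{\gamma_z(1)}_{x \cup y} < \tau^{\gamma_z(1)}_{\gamma_z(1)}\big] \ge \mathbb{P}_A\big[\tau^{\gamma_z(1)}_x < \tau^{\gamma_z(1)}_{\gamma_z(1)}\big] \ge \mathbb{P}_\gamma\big[\tau^{\gamma_z(1)}_x < \tau^{\gamma_z(1)}_{\gamma_z(1)}\big] \tag{2.22}$$

where $\gamma$ is a one dimensional path going from $\gamma_z(1)$ to $x$. We will show later that

$$\mathbb{P}_\gamma\big[\tau^{\gamma_z(1)}_x < \tau^{\gamma_z(1)}_{\gamma_z(1)}\big] \ge CN^{-1/2} \tag{2.23}$$

Thus we get that

$$\mathbb{P}_A\big[\tau^{\gamma_z(1)}_x < \tau^{\gamma_z(1)}_y\big] \ge 1 - CN^{d/2} e^{-N[F_N(z^*(y,x)) - E]} \tag{2.24}$$

By the same procedure, we get also that

$$\mathbb{P}_A\big[\tau^{\gamma_z(0)}_y < \tau^{\gamma_z(0)}_{D_x}\big] \ge 1 - CN^{d/2} e^{-N[F_N(z^*(y,x)) - E]} \tag{2.25}$$

Putting all these estimates together, we arrive at the assertion of the lemma. ◊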

We are left to prove the lower bounds for the purely one-dimensional problems whose treatment is explained for instance in [vK]. In fact, we will show that

Proposition 2.4: Let $\gamma_z$ be a one dimensional path such that $F_N$ attains its maximum on $\gamma_z$ at $z$. Then there is a constant $0 < C < \infty$ such that

$$\mathbb{P}_{\gamma_z}\big[\tau^{\gamma_z(1)}_{\gamma_z(0)} < \tau^{\gamma_z(1)}_{\gamma_z(1)}\big] \ge CN^{-1/2}\, e^{-N[F_N(z) - F_N(\gamma_z(1))]} \tag{2.26}$$

Proof: Let $K$ denote the number of edges in the path $\gamma_z$. Let us fix the notation $\omega_0, \ldots, \omega_K$ for the ordered sites of the path $\gamma_z$, with $\gamma_z(1) = \omega_0$, $\gamma_z(0) = \omega_K$. For any site $\omega_n$ we introduce the probabilities to jump to the right, resp. the left

$$p(n) = p_N(\omega_n, \omega_{n+1}), \qquad q(n) = p_N(\omega_n, \omega_{n-1}) \tag{2.27}$$

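We will first show that

Lemma 2.5: With the notation introduced above,

$$\mathbb{P}_{\gamma_z}\big[\tau^{\gamma_z(1)}_{\gamma_z(0)} < \tau^{\gamma_z(1)}_{\gamma_z(1)}\big] = \left[\sum_{n=1}^{K} \frac{Q_N(\omega_0)}{Q_N(\omega_n)}\, \frac{1}{q(n)}\right]^{-1} \tag{2.28}$$

Proof: Let us denote by $r(n)$ the solution of the boundary value problem

$$r(n)\big(p(n) + q(n)\big) = p(n)\, r(n+1) + q(n)\, r(n-1), \quad \text{for } 0 < n < K; \qquad r(0) = 0,\ r(K) = 1 \tag{2.29}$$

Obviously we have that

$$\mathbb{P}_{\gamma_z}\big[\tau^{\gamma_z(1)}_{\gamma_z(0)} < \tau^{\gamma_z(1)}_{\gamma_z(1)}\big] = p(0)\, r(1) \tag{2.30}$$

(2.29) has the following well known unique solution

$$r(n) = \frac{\sum_{k=1}^{n} \prod_{\ell=k}^{K-1} \frac{p(\ell)}{q(\ell)}}{\sum_{k=1}^{K} \prod_{\ell=k}^{K-1} \frac{p(\ell)}{q(\ell)}} \tag{2.31}$$

hence,

$$\mathbb{P}_{\gamma_z}\big[\tau^{\gamma_z(1)}_{\gamma_z(0)} < \tau^{\gamma_z(1)}_{\gamma_z(1)}\big] = p(0)\, \frac{\prod_{\ell=1}^{K-1} \frac{p(\ell)}{q(\ell)}}{\sum_{k=1}^{K} \prod_{\ell=k}^{K-1} \frac{p(\ell)}{q(\ell)}} \tag{2.32}$$

Now reversibility reads $Q_N(\omega_\ell)\, p(\ell) = Q_N(\omega_{\ell+1})\, q(\ell+1)$, and this allows to simplify

$$\prod_{\ell=1}^{k-1} \frac{p(\ell)}{q(\ell)} = \frac{q(k)\, Q_N(\omega_k)}{q(1)\, Q_N(\omega_1)} \tag{2.33}$$

and finally

$$\mathbb{P}_{\gamma_z}\big[\tau^{\gamma_z(1)}_{\gamma_z(0)} < \tau^{\gamma_z(1)}_{\gamma_z(1)}\big] = \left[Q_N(\omega_0) \sum_{k=1}^{K} \big[q(k)\, Q_N(\omega_k)\big]^{-1}\right]^{-1} \tag{2.34}$$

which is the assertion of the lemma.

We are left to estimate the sum $Q_N(\omega_0) \sum_{k=1}^{K} [q(k)\, Q_N(\omega_k)]^{-1}$ uniformly in $K$. Since $q(k) \ge c > 0$ for all $1 \le k \le K$, for an upper bound on this sum it is enough to consider

$$\sum_{k=1}^{K} \frac{Q_N(\omega_0)}{Q_N(\omega_k)} = \frac{Q_N(\omega_0)}{Q_N(z)} \sum_{k=1}^{K} e^{-N[F_N(z) - F_N(\omega_k)]} \tag{2.35}$$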
Now in the neighborhood of $z$, we can certainly bound

$$F_N(z) - F_N(\omega_k) \ge c\, |z - \omega_k|^2 \tag{2.36}$$

while elsewhere $F_N(z) - F_N(\omega_k) \ge \epsilon > 0$ (of course nothing changes if the paths have to pass over finitely many saddle points of equal height), and from this it follows immediately by elementary estimates that uniformly in $K$

$$\sum_{k=1}^{K} e^{-N[F_N(z) - F_N(\omega_k)]} \le CN^{1/2} \tag{2.37}$$

which in turn concludes the proof of Proposition 2.4.** ◊◊
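The one-dimensional formula of Lemma 2.5 is easy to test numerically. The following is a minimal sketch, not part of the original argument: the weights `Q` and the Metropolis-type rates are hypothetical choices made only so that the chain is reversible with respect to `Q`. It evaluates the closed formula (2.28) and compares it with a direct solution of the boundary value problem for $r(n)$ by forward recursion, using exact rational arithmetic.

```python
from fractions import Fraction

def metropolis_rates(Q):
    """Nearest-neighbour rates on sites 0..K, reversible w.r.t. the weights Q:
    Q[n] * p[n] == Q[n + 1] * q[n + 1]. (Illustrative choice of rates.)"""
    K = len(Q) - 1
    half = Fraction(1, 2)
    p = [half * min(Fraction(1), Q[n + 1] / Q[n]) if n < K else Fraction(0)
         for n in range(K + 1)]
    q = [half * min(Fraction(1), Q[n - 1] / Q[n]) if n > 0 else Fraction(0)
         for n in range(K + 1)]
    return p, q

def escape_by_formula(Q, q):
    """Closed formula: [sum_{n=1}^K Q(w_0) / (Q(w_n) q(n))]^{-1}."""
    K = len(Q) - 1
    return 1 / sum(Q[0] / (Q[n] * q[n]) for n in range(1, K + 1))

def escape_by_recursion(p, q):
    """p(0) r(1), where r solves the boundary value problem with r(0) = 0 and
    r(K) = 1. The recursion is started with r(1) = 1 and normalised at the end."""
    K = len(p) - 1
    r = [Fraction(0), Fraction(1)]
    for n in range(1, K):
        r.append(((p[n] + q[n]) * r[n] - q[n] * r[n - 1]) / p[n])
    return p[0] * r[1] / r[K]

# Hypothetical weights with a single barrier (smallest weight) in the middle.
Q = [Fraction(w) for w in (8, 4, 2, 1, 2, 4, 8)]
p, q = metropolis_rates(Q)
formula_value = escape_by_formula(Q, q)
direct_value = escape_by_recursion(p, q)
```

In this example the sum in the formula is dominated by the sites near the barrier, which is the discrete counterpart of the Gaussian sum estimated in (2.37).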

Combining Proposition 2.4 with Lemma 2.3, we get that

$$\begin{aligned} \mathbb{P}_A[\tau^y_x < \tau^y_y] &\ge \Big(1 - CN^{d/2} e^{-N[F_N(z^*(y,x)) - E]}\Big) \sum_{z \in B} \frac{Q_N(\gamma_z(1))}{Q_N(y)}\, \frac{Q_N(z)}{Q_N(\gamma_z(1))}\, CN^{-1/2} \\ &= e^{-N[F_N(z^*(y,x)) - F_N(y)]} \Big(1 - CN^{d/2} e^{-N[F_N(z^*(y,x)) - E]}\Big)\, CN^{-1/2} \sum_{z \in B} e^{-N[F_N(z) - F_N(z^*(y,x))]} \end{aligned} \tag{2.38}$$

By our assumptions $F_N(z) - F_N(z^*(y,x))$ restricted to the surface $\mathcal{S}_N$ is bounded from above by a quadratic function in a small neighborhood of $z^*(y,x)$ and so, if $B$ is chosen to be such a neighborhood, the lower bound claimed in Theorem 1.11 follows immediately by a standard Gaussian approximation of the last sum.
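The power counting behind this last step can be written out explicitly. The following is only a sketch under two assumptions that are made here for illustration: the points of $B$ lie on a lattice of spacing $1/N$ inside the $(d-1)$-dimensional surface, and the quadratic upper bound holds with some constant $a > 0$ (the symbol $a$ is not part of the original text).

```latex
\sum_{z \in B} e^{-N[F_N(z) - F_N(z^*)]}
  \;\ge\; \sum_{z \in B} e^{-aN|z - z^*|^2}
  \;\approx\; N^{d-1} \int_{|w| \le CN^{-1/2}} e^{-aN|w|^2}\, dw
  \;=\; N^{(d-1)/2} \int_{|v| \le C} e^{-a|v|^2}\, dv ,
\qquad\text{so that}\qquad
CN^{-1/2} \cdot N^{(d-1)/2} = CN^{(d-2)/2} .
```

This is the same power of $N$ as in the upper bound (1.32), whereas a single path would only give the factor $N^{-1/2}$.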



3. Laplace transforms of transition times in the elementary situation 

In this section we shall prove Theorem 1.10, which is our basic estimate for the Laplace transforms of elementary transition times. We shall need the sharp estimates on the transition probabilities which we obtained in the previous section based on Lemma 2.2. Combined with reversibility they lead to an estimate on the hitting time $\tau^y_{\mathcal{M}_N}$. This is the basic analytic result needed to estimate the Laplace transforms, using their usual representation as solutions of an appropriate boundary value problem. Let us recall the notation

$$G^x_{y,\Sigma}(u) = \mathbb{E}\Big[e^{u\tau^x_y}\, \mathbb{1}_{\{\tau^x_y \le \tau^x_\Sigma\}}\Big], \qquad g^x_y(u) = G^x_{y,\mathcal{M}_N}(u)$$

** Of course we could easily be more precise and identify the constant in (2.37) to leading order with the second derivative of $F(z)$ in the direction of $\gamma$ (see e.g. [vK] where this computation is given in the case of the continuum setting, and [KMST] where a formal asymptotic expansion is derived in the discrete case).
In this section $\Sigma$ will always denote a proper nonempty subset of $\Gamma_N$ that contains $\mathcal{M}_N$. Moreover, we will assume that $y$ is not in the interior of $\Sigma$, i.e. it is not impossible that $y$ is reached before $\Sigma$ from $x$, since otherwise $G^x_{y,\Sigma}(u) = 0$ trivially.

To prove Theorem 1.10, it is enough to show that

$$g^x_y(u) \le CN^{d/2}\, e^{-N[F_N(z^*(x,y)) - F_N(x)]} \tag{3.1}$$

for real and positive $u < cN^{-d-3/2}$. Note that $z^*(x,y)$ is defined in (1.3) in such a way that $z^*(x,y)$ equals $x$ if $y$ can be reached from $x$ without passing a point at which $F_N$ is larger than $F_N(x)$. Analyticity then follows since $g^x_y(u)$ is a Laplace transform of the distribution of a positive random variable, and the estimates for $k \ge 1$ follow using Cauchy's inequality.

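In the sequel we will fix $y \in \Sigma$ and $\mathcal{M}_N \subset \Sigma \subset \Gamma_N$. It will be useful to define the function

$$v_u(x) = \begin{cases} G^x_{y,\Sigma}(u) & \text{for } x \notin \Sigma \\ 1 & \text{for } x = y \\ 0 & \text{for } x \in \Sigma \setminus y \end{cases} \tag{3.2}$$

As explained in the introduction, $v_u(x)$ is analytic near $u = 0$ (so far without any control in $N$ on the region of analyticity).

Similarly, we define the function

$$w_0(x) = \begin{cases} \mathbb{E}[\tau^x_{\mathcal{M}_N}] & \text{for } x \notin \mathcal{M}_N \\ 0 & \text{for } x \in \mathcal{M}_N \end{cases} \tag{3.3}$$

Observe that as a consequence of Lemma 1.1 of the introduction we get (for any $x, y \in \Gamma_N$, $\Sigma \subset \Gamma_N$) that

$$G^x_{y,\Sigma}(u) = e^u\, p_N(x,y) + e^u \sum_{z \notin \Sigma} p_N(x,z)\, G^z_{y,\Sigma}(u). \tag{3.4}$$

Using this identity one readily deduces that $v_u$ is the unique solution of the boundary value problem

$$(1 - e^u P_N)\, v_u(x) = 0 \ \ (x \notin \Sigma), \qquad v_u(y) = 1, \qquad v_u(x) = 0 \ \ (x \in \Sigma \setminus y), \tag{3.5}$$

and, in the same way, $w_0$ is the unique solution of

$$(1 - P_N)\, w_0(x) = 1 \ \ (x \notin \mathcal{M}_N), \qquad w_0(x) = 0 \ \ (x \in \mathcal{M}_N). \tag{3.6}$$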
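We shall use these auxiliary functions to prove the crucial

Lemma 3.1: There is a constant $C \in \mathbb{R}$ such that for all $N$ large enough

$$T_N := \max_{y \in \Gamma_N} \mathbb{E}[\tau^y_{\mathcal{M}_N}] \le CN^{d+1} \tag{3.7}$$

Proof: In view of the Kolmogorov forward equations it suffices to consider the case $y \notin \mathcal{M}_N$. We set $\Sigma = \mathcal{M}_N \cup y$, where $y \notin \mathcal{M}_N$. Then $v_0(x)$ defined in (3.2) solves the Dirichlet problem

$$(1 - P_N)\, v_0(x) = 0 \ \ (x \notin \Sigma), \qquad v_0(y) = 1, \qquad v_0(x) = 0 \ \ (x \in \mathcal{M}_N) \tag{3.8}$$

Moreover, (3.4) with $u = 0$ and $x = y$ reads (since $G^y_{y,\Sigma}(0) = \mathbb{P}[\tau^y_y < \tau^y_{\mathcal{M}_N}]$)

$$1 - \mathbb{P}[\tau^y_{\mathcal{M}_N} < \tau^y_y] = \sum_{z \in \Gamma_N} p_N(y,z)\, v_0(z) \tag{3.9}$$

which can be written as

$$(1 - P_N)\, v_0(y) = \mathbb{P}[\tau^y_{\mathcal{M}_N} < \tau^y_y] \tag{3.10}$$

We shall use $v_0(x)$ as a fundamental solution for $1 - P_N$ and, using the symmetry of $P_N$ in $\ell^2(\Gamma_N, Q_N)$, we get

$$\begin{aligned} Q_N(y)\, \mathbb{P}[\tau^y_{\mathcal{M}_N} < \tau^y_y]\, \mathbb{E}[\tau^y_{\mathcal{M}_N}] &= \big((1 - P_N) v_0, w_0\big)_Q \\ &= \big(v_0, (1 - P_N) w_0\big)_Q \\ &= Q_N(y) + \sum_{x \notin \Sigma} Q_N(x)\, \mathbb{P}[\tau^x_y < \tau^x_{\mathcal{M}_N}], \end{aligned} \tag{3.11}$$

where in the last step we have used equation (3.6) and the fact that $y \notin \mathcal{M}_N$. This gives the crucial formula for the expected hitting time in terms of the invariant measure $Q_N$ and transition probabilities, namely

$$\mathbb{E}[\tau^y_{\mathcal{M}_N}] = \sum_{x \notin \Sigma} \frac{Q_N(x)}{Q_N(y)}\, \frac{\mathbb{P}[\tau^x_y < \tau^x_{\mathcal{M}_N}]}{\mathbb{P}[\tau^y_{\mathcal{M}_N} < \tau^y_y]} + \frac{1}{\mathbb{P}[\tau^y_{\mathcal{M}_N} < \tau^y_y]}. \tag{3.12}$$

We remark that in this sum only those values of $x$ with $Q_N(x) \ge Q_N(y)$ can give a large contribution. To estimate the probabilities in equation (3.12) we choose, given the starting point $y \notin \mathcal{M}_N$, an appropriate minimum $z \in \mathcal{M}_N$ near $y$ such that there is a path $\gamma : y \to z$ (of moderate cardinality) so that $F_N$ attains its maximum on $\gamma$ at $y$ (note that such a $z$ exists trivially always). Then the variational principle in equation (2.12) (with $\gamma$ as the subgraph $A$) gives

$$\mathbb{P}[\tau^y_{\mathcal{M}_N} < \tau^y_y] \ge \mathbb{P}[\tau^y_z < \tau^y_y] \ge \mathbb{P}_\gamma[\tau^y_z < \tau^y_y], \tag{3.13}$$

where the first inequality is a trivial consequence of $z \in \mathcal{M}_N$. But then Proposition 2.4 can be applied to get the lower bound

$$\mathbb{P}[\tau^y_{\mathcal{M}_N} < \tau^y_y] \ge CN^{-1/2} \tag{3.14}$$

for some constant $C$.

To estimate the other probability in (3.12) we use Corollary 1.9 to write, for $x \notin \Sigma$,

$$\mathbb{P}[\tau^x_y < \tau^x_{\mathcal{M}_N}] = \frac{\mathbb{P}[\tau^x_y < \tau^x_{\mathcal{M}_N \cup x}]}{\mathbb{P}[\tau^x_\Sigma < \tau^x_x]} \tag{3.15}$$

Since $\mathcal{M}_N \subset \Sigma$, we obtain from (3.14) that for $x \notin \Sigma$,

$$\mathbb{P}[\tau^x_\Sigma < \tau^x_x] \ge \mathbb{P}[\tau^x_{\mathcal{M}_N} < \tau^x_x] \ge CN^{-1/2} \tag{3.16}$$

Reversibility then gives the upper bound

$$\mathbb{P}[\tau^x_y < \tau^x_{\mathcal{M}_N \cup x}] \le \min\Big(1, \frac{Q_N(y)}{Q_N(x)}\Big). \tag{3.17}$$

Thus, inserting (3.16) and (3.17) into (3.15) we obtain from the representation (3.12) that

$$\mathbb{E}[\tau^y_{\mathcal{M}_N}] \le CN \Big(1 + \sum_{x \notin \Sigma} \min\Big(\frac{Q_N(x)}{Q_N(y)}, 1\Big)\Big) \le CN^{d+1} \tag{3.18}$$

for some constant $C$. This proves the lemma.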

Next we need an estimate on the Laplace transform $G^x_{y,\Sigma}(u)$. This will be obtained from an integral representation of our auxiliary function $v_u(x)$, choosing $u$ smaller than the estimate on the inverse of the maximal expected time $T_N$ obtained in Lemma 3.1. More precisely, we shall prove

Lemma 3.2: Assume that $\mathcal{M}_N \subset \Sigma \subset \Gamma_N$. Then there is a constant $c > 0$ such that for all $u < cN^{-d-1}$ and all $x, y \in \Gamma_N$,

$$G^x_{y,\Sigma}(u) \le 2 \tag{3.19}$$

Furthermore, there are constants $b, c > 0$ such that for all $u < cN^{-d-3/2}$ and $y \in \Gamma_N \setminus \mathcal{M}_N$,

$$1 - G^y_{y,\Sigma}(u) \ge bN^{-1/2} \tag{3.20}$$
Proof: As mentioned in the beginning of this chapter we can assume without loss of generality that $y \in \partial\Sigma$. Then it follows from equation (3.5) that the function $w_u(x) := v_u(x) - v_0(x)$ solves the Dirichlet problem

$$(1 - P_N)\, w_u(x) = (1 - P_N)\, v_u(x) = (1 - e^{-u})\, v_u(x) \ \ (x \notin \Sigma), \qquad w_u(x) = 0 \ \ (x \in \Sigma) \tag{3.21}$$

The relation between resolvent and semi-group gives the following representation for $x \notin \Sigma$

$$w_u(x) = \mathbb{E}\left[\sum_{t=0}^{\tau^x_\Sigma - 1} f(X_t)\right], \qquad f(x) := (1 - e^{-u})\, v_u(x) \tag{3.22}$$

that in turn yields the integral equation

$$v_u(x) = \mathbb{P}[\tau^x_y = \tau^x_\Sigma] + (1 - e^{-u})\, \mathbb{E}\left[\sum_{t=0}^{\tau^x_\Sigma - 1} v_u(X_t)\right] \tag{3.23}$$

for the function $v_u$. We can now use our a priori bounds from Lemma 3.1 on the expectation of the stopping time to extract an upper bound for the sup-norm of this function. Namely, setting $M(u) := \sup_{x \notin \Sigma} v_u(x)$ we obtain the estimate

$$M(u) \le 1 + |1 - e^{-u}| \max_{x \in \Gamma_N} \mathbb{E}[\tau^x_\Sigma]\, M(u) \le 1 + \tfrac{1}{3} M(u), \tag{3.24}$$

where we have used that $|u| < cN^{-d-1}$ with $c$ sufficiently small. This gives for $x \notin \Sigma$,

$$G^x_{y,\Sigma}(u) \le 3/2 \tag{3.25}$$

The estimate of the Laplace transform $G^x_{y,\Sigma}$ is trivial for negative $u$ or for $x \in \Sigma \setminus \partial\Sigma$. In the case $x \in \partial\Sigma$, (3.19) follows from (3.4), using (3.25).

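To prove the estimate (3.20) on the Laplace transform $G^y_{y,\Sigma}$ of the recurrence time to the boundary point $y \in \partial\Sigma$ (in particular $y \in \Sigma \setminus \mathcal{M}_N$ under our assumptions), observe that for any $\delta > 0$, there exists $c > 0$ such that for $|u| < cN^{-d-3/2}$, using Lemma 3.1 to estimate $\mathbb{E}[\tau^x_\Sigma] \le \mathbb{E}[\tau^x_{\mathcal{M}_N}]$ from above, it follows that

$$\mathbb{E}\left[\sum_{t=0}^{\tau^x_\Sigma - 1} |1 - e^{-u}|\right] = |1 - e^{-u}|\, \mathbb{E}[\tau^x_\Sigma] \le \delta N^{-1/2} \tag{3.26}$$

Inserting this estimate and the a priori bound (3.19) into (3.23) gives then that

$$G^x_{y,\Sigma}(u) \le \mathbb{P}[\tau^x_y = \tau^x_\Sigma] + 2\delta N^{-1/2} \tag{3.27}$$

Inserting (3.27) into (3.4), which represents $G^y_{y,\Sigma}(u)$ via $G^x_{y,\Sigma}(u)$ for $x \notin \Sigma$, it follows that modulo $\delta N^{-1/2}$ one has, for $|u| < cN^{-d-3/2}$,

$$\begin{aligned} 1 - G^y_{y,\Sigma}(u) &\ge 1 - e^u\, p_N(y,y) - e^u \sum_{x \notin \Sigma} p_N(y,x)\, \mathbb{P}[\tau^x_y = \tau^x_\Sigma] \\ &= 1 - \mathbb{P}[\tau^y_y = \tau^y_\Sigma] \\ &= \mathbb{P}[\tau^y_\Sigma < \tau^y_y]. \end{aligned} \tag{3.28}$$

Since $\mathcal{M}_N \subset \Sigma$ and $y \in \Sigma \setminus \mathcal{M}_N$ one obtains from (3.16) that

$$1 - G^y_{y,\Sigma}(u) \ge \mathbb{P}[\tau^y_{\mathcal{M}_N} < \tau^y_y] - C\delta N^{-1/2} \ge bN^{-1/2} \tag{3.29}$$

for some $b > 0$, choosing $\delta$ sufficiently small in equation (3.26). This proves Lemma 3.2. ◊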

We are now ready to give the

Proof of Theorem 1.10: Note that when $F_N(x) = F_N(z^*(x,y))$, Lemma 3.2 already provides the desired (actually a sharper) estimate. It remains to consider the case $z^*(x,y) \neq x$.

Here we can, as in the proof of Theorem 1.11 in Section 2, construct a discrete separating hyper-surface $\mathcal{S}_N$ containing the minimal saddle $z^*(x,y)$ and separating $y$ and $x$. Since the process starting at $x$ must hit $\mathcal{S}_N$ before hitting $y$, path splitting at $\mathcal{S}_N$ gives

$$g^x_y(u) = \sum_{z \in \mathcal{S}_N} G^x_{z,\Omega}(u)\, g^z_y(u), \qquad \Omega = \mathcal{M}_N \cup \mathcal{S}_N. \tag{3.30}$$

We treat the cases $x \in \mathcal{M}_N$ and $x \notin \mathcal{M}_N$ separately. In the latter case we need an additional renewal argument, while in the former all loops are suppressed since the process is killed upon arrival at $x \in \mathcal{M}_N$. For $x \notin \mathcal{M}_N$ the renewal equation (1.27) reads

$$G^x_{z,\Omega}(u) = \big(1 - G^x_{x,\Omega}(u)\big)^{-1}\, G^x_{z,\Omega \cup x}(u) \tag{3.31}$$

By reversibility and the bound (3.19) of Lemma 3.2 we have

$$G^x_{z,\Omega \cup x}(u) = \frac{Q_N(z)}{Q_N(x)}\, G^z_{x,\Omega \cup x}(u) \le 2\, \frac{Q_N(z)}{Q_N(x)} \tag{3.32}$$

Combining (3.32) and (3.20) of Lemma 3.2 we get from the renewal equation (3.31)

$$G^x_{z,\Omega}(u) \le CN^{1/2}\, \frac{Q_N(z)}{Q_N(x)}, \qquad (z \in \mathcal{S}_N,\ u < cN^{-d-3/2}) \tag{3.33}$$

for $c > 0$ sufficiently small.

If $x \in \mathcal{M}_N$, we directly apply the reversibility argument to $G^x_{z,\Omega}(u)$ (without renewal) and obtain a sharper estimate, i.e. (3.33) with $N^{1/2}$ deleted on the right hand side.

Inserting (3.33) into (3.30) and using (3.19) to estimate the Laplace transform $g^z_y(u) = G^z_{y,\mathcal{M}_N}(u)$ we finally get, for $u < cN^{-d-3/2}$,

$$g^x_y(u) \le CN^{1/2}\, Q_N(x)^{-1} \sum_{z \in \mathcal{S}_N} Q_N(z) = O(N^{d/2})\, e^{-N(F_N(z^*(x,y)) - F_N(x))}, \tag{3.34}$$

where the last equality is obtained by a standard Gaussian approximation as in (2.38). All estimates on the derivatives $k \ge 1$ now follow from Cauchy's inequality and the obvious extension of our estimates to complex values of $u$. This completes the proof of Theorem 1.10. ◊◊



4. Valleys, trees and graphs 

In this chapter we provide the setup for the inductive treatment of the global problem. Although this description is not particularly original, and is essentially equivalent to the approach of Freidlin and Wentzell [WF], we give a self-contained exposition of our version that we find particularly suitable for the specific problem at hand. To keep the description as simple as possible, we make the assumption that $F_N$ is "generic" in the sense that no accidental symmetries or other "unusual" structures occur. This will be made more precise below. For the case of a random system, this appears to be a natural assumption.

4.1. The valley structure and its tree-representation 

We recall from Section 1 the definition (1.3) of essential saddle points. Under our general assumptions (G1), any essential saddle $z$ has the property that the connected (according to the graph structure on $\Gamma_N$) component of the level set $A_z = \{x \in \Gamma_N : F_N(x) \le F_N(z)\}$ that contains $z$ falls into two disconnected components when $z$ is removed from it.

These two components are called "valleys" and denoted by $V^\pm(z)$, with the understanding that

(4.1)

holds. We denote by $\mathcal{E}_N$ the set of all essential saddle points.

With any valley we associate two characteristics: its "height",

$$h(V^\pm(z)) = F_N(z) \tag{4.2}$$

and its "depth",

$$d(V^\pm(z)) = F_N(z) - \inf_{x \in V^\pm(z)} F_N(x) \tag{4.3}$$

The essential topological structure of the landscape $F_N$ is encoded in a tree structure that we now define on the set $\mathcal{M}_N \cup \mathcal{E}_N$. To construct this, we define, for any essential saddle $z \in \mathcal{E}_N$, the two points

$$z^\pm = \begin{cases} \mathrm{argmax}_{z' \in \mathcal{E}_N \cap V^\pm(z)}\, F_N(z'), & \text{if } \mathcal{E}_N \cap V^\pm(z) \neq \emptyset \\ \mathcal{M}_N \cap V^\pm(z), & \text{if } \mathcal{E}_N \cap V^\pm(z) = \emptyset \end{cases} \tag{4.4}$$

(note that necessarily the set $\mathcal{M}_N \cap V^\pm(z)$ consists of a single point if $\mathcal{E}_N \cap V^\pm(z) = \emptyset$). Now draw a link from any essential saddle to the two points $z^\pm$. This produces a connected tree, $\mathcal{T}_N$, with vertex set $\mathcal{E}_N \cup \mathcal{M}_N$ having the property that all the vertices with coordination number 1 (endpoints) correspond to local minima, while all other vertices are essential saddle points. An alternative equivalent way to construct this tree is by starting from below: From each local minimum, draw a link to the lowest essential saddle connecting it to other minima. Then from each saddle point that was reached before, draw a line to the lowest saddle above it that connects it to further minima. Continue until exhaustion. We see that under our assumption of non-degeneracy, both procedures give a unique answer. (But note that in a random system the answer can depend on the value of $N$!)

The tree $\mathcal{T}_N$ induces a natural hierarchical distance between two points in $\mathcal{E}_N \cup \mathcal{M}_N$, given by the length of the shortest path on $\mathcal{T}_N$ needed to join them. We will also call the "level" of a vertex its distance to the root, $z_0$.

The properties of the long-time behaviour of the process will be mainly read off from the structure of the tree $\mathcal{T}_N$ and the values of $F_N$ on the vertices of $\mathcal{T}_N$. However, this information will not be quite sufficient. In fact, we will see that the information encoded in the tree contains all information on the time-scales of "exits" from valleys; what is still missing is how the process descends into a neighboring valley after such an exit. It turns out
that all we need to know in addition is which minimum the process visits first after crossing a saddle point. This point deserves some discussion. First, we note that the techniques we have employed so far in this paper are insufficient to answer such a question. Second, it is clear that without further assumptions, there will not be a deterministic answer to this question; that is, in general it is possible that the process has the option to visit various minima first with certain probabilities. If this situation occurs, one should compute these probabilities; this appears, however, an exceedingly difficult task that is beyond the scope of the present paper. We will therefore restrict our attention to the situation where $F_N$ is such that there is always one minimum that is visited first with overwhelming probability. To analyse this problem, we need to discuss an issue that we have so far avoided, that of sample path large deviations for the (relatively) short time behaviour of our processes. A detailed treatment of this problem is given in [BG2] and, as this issue concerns the present paper only marginally, we will refer the interested reader to that paper and keep the discussion here to a minimum. What we will need here is that for "short" times, i.e. for times $t = TN$, $T < \infty$, the process starting at any point $x_0$ at time 0 will remain (arbitrarily) close (on the macroscopic scale) to certain deterministic trajectories $x(t, x_0)$ with probability exponentially close to one†. These trajectories are solutions of certain differential equations involving the function $F$. In the continuum approximation they are just the gradient flow of $F$, i.e. $\frac{d}{dt} x(t) = -\nabla F(x(t))$, $x(0) = x_0$, and while the equations are more complicated in the discrete case they are essentially of similar nature. In particular, all critical points are always fixed points. We will assume that the probability to reach a $\delta$-neighborhood of the boundary of $A$ in finite time $T$ will be exponentially small for all fixed $T$. We will assume further that at each essential saddle the deterministic paths starting in a neighborhood of $z$ lead into uniquely specified minima within the two valleys connected through $z$. As we will see, these paths will determine the behaviour of the process.
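The role of the deterministic trajectories can be illustrated in the continuum approximation. The sketch below is only an illustration under assumptions that are not part of the text: the one-dimensional double-well $F(x) = x^4/4 - x^2/2$, the explicit Euler scheme, and the step size are all hypothetical choices. It shows that the flow started on either side of the saddle at $x = 0$ runs into a uniquely specified minimum, while the saddle itself is a fixed point.

```python
def grad_F(x):
    """Gradient of the illustrative double-well F(x) = x**4 / 4 - x**2 / 2."""
    return x ** 3 - x

def gradient_flow(x0, dt=0.01, steps=5000):
    """Explicit Euler discretisation of dx/dt = -F'(x), started at x0."""
    x = x0
    for _ in range(steps):
        x -= dt * grad_F(x)
    return x

right_limit = gradient_flow(0.1)    # starts just to the right of the saddle
left_limit = gradient_flow(-0.1)    # starts just to the left of the saddle
saddle_limit = gradient_flow(0.0)   # a critical point is a fixed point
```

In this toy picture the two limits play the role of the minima to which the arrows of the decorated tree point.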

We will incorporate this information in our graphical representation by decorating the tree by adding two yellow‡ arrows pointing from each essential saddle to the minima in each of the branches of the tree emanating from it into which the deterministic paths lead. (These branches are essentially obtained by following the gradient flow from the saddle into the next minimum on both sides.) We denote the tree decorated with the yellow arrows by $\widetilde{\mathcal{T}}_N$.
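The bottom-up construction of the (undecorated) tree described in Section 4.1 can be sketched with a union-find pass over the sites in order of increasing $F_N$. This is a minimal illustration under the genericity assumption (all values of $F_N$ distinct); the function name, the input format, and the one-dimensional example landscape are hypothetical and not taken from the text. A site that joins two or more already present components is recorded as a saddle, and it is linked to the highest saddle, or else the unique minimum, of each component it merges.

```python
def merge_tree(F, neighbours):
    """Return (minima, children) for a landscape F on a graph.

    F: dict site -> value (assumed pairwise distinct).
    neighbours: dict site -> list of adjacent sites.
    children[z] lists, for each saddle z, the vertices it is linked to below.
    """
    parent = {}   # union-find forest over the sites processed so far
    top = {}      # component root -> highest tree vertex in that component
    minima, children = [], {}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for v in sorted(F, key=F.get):
        roots = sorted({find(w) for w in neighbours[v] if w in parent})
        parent[v] = v
        if not roots:                 # no lower neighbour: local minimum
            minima.append(v)
            top[v] = v
        elif len(roots) == 1:         # v just enlarges an existing valley
            parent[v] = roots[0]
        else:                         # v merges valleys: a saddle
            children[v] = [top[r] for r in roots]
            for r in roots:
                parent[r] = v
            top[v] = v
    return minima, children

# Hypothetical landscape on the path graph 0 - 1 - ... - 6.
F = {0: 3, 1: 1, 2: 4, 3: 0, 4: 5, 5: 2, 6: 6}
neighbours = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
minima, children = merge_tree(F, neighbours)
```

For this landscape the root of the tree is the saddle at site 4, which is linked to the lower saddle at site 2 and to the minimum at site 5.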

4.2. Construction of the transition process 

† Convergence of this type of processes to deterministic trajectories was first proved on the level of the law of large numbers by Kurtz [Ku].

‡ This color was used in the original drawing on a blackboard in the office of V.G. in the CPT, Marseille, and is retained here for historical reasons.
We are in principle interested in questions like "how long does the process take to get from one minimum to another?". This question is more subtle than one might think. A related question, that should precede the previous one, is actually "how does the process get from one minimum to another one?", and we will first make this question precise and provide an answer.

We recall that in (1.19) we have given a representation of the process going from $y$ to $x$ in terms of a random walk on the minima. As we pointed out there, this representation was not extremely useful. We will now show that it is possible to give another decomposition of the process that is much more useful.

Let us consider the event $\mathcal{F}(x,y) = \{\tau^y_x < \infty\}$ with $x, y \in \mathcal{M}_N$. Of course this event has probability one. We now describe an algorithm that will allow us to decompose this event, up to a set of exponentially small measure, into a sequence of "elementary" transitions of the form

$$\mathcal{F}(x_i, z_i, x_{i+1}) = \Big\{\tau^{x_i}_{x_{i+1}} \le \tau^{x_i}_{\mathcal{T}^c_{z_i,x_i} \cap \mathcal{M}_N}\Big\} \tag{4.5}$$

where $x_i, x_{i+1} \in \mathcal{M}_N$, $z_i$ is the first common ancestor of $x_i$ and $x_{i+1}$ in the tree $\mathcal{T}_N$, $\mathcal{T}_{z_i,x_i}$ is the branch of $\mathcal{T}_N$ emanating from $z_i$ that contains $x_i$, and $\mathcal{T}^c_{z_i,x_i} = \mathcal{T}_N \setminus \mathcal{T}_{z_i,x_i}$. We will write $\mathcal{T}_z$ for the union of all branches emanating from $z$. The motivation for this definition is contained in the following

Proposition 4.1: Let $x, y \in \mathcal{T}_{z,x} \cap \mathcal{M}_N$, and $\bar{y} \in \mathcal{T}^c_{z,x} \cap \mathcal{M}_N$. Then there is a constant $C < \infty$ such that

$$\mathbb{P}[\tau^x_{\bar{y}} < \tau^x_y] \le \inf_{z'} C e^{-N[F_N(z) - F_N(z')]} \tag{4.6}$$

Remark: Note that by construction we have $F_N(z) - F_N(z') > 0$ for all lower saddles in the branch $\mathcal{T}_{z,y}$. Thus the proposition asserts that with enormous probability, the process starting from any minimum in a given valley visits all other minima in that same valley before visiting any minimum outside of this valley. As a matter of fact, the same also holds for general points. Thus what the proposition says is that up to the first exit from a valley, the process restricted to this valley behaves like an ergodic one.

Proof: We use Corollary 1.9 with $I$ consisting of a single point. This gives

$$\mathbb{P}[\tau^x_{\bar{y}} < \tau^x_y] = \frac{\mathbb{P}[\tau^x_{\bar{y}} < \tau^x_{y \cup x}]}{\mathbb{P}[\tau^x_{y \cup \bar{y}} < \tau^x_x]} \tag{4.7}$$

Using the upper and lower bounds from Theorem 1.11 for the numerator and denominator, resp., we get

$$\mathbb{P}[\tau^x_{\bar{y}} < \tau^x_y] \le C e^{-N[F_N(z) - F_N(z')]} \tag{4.8}$$

where $z'$ is the lowest saddle connecting $x$ and $y$. (4.8) yields the proposition. ◊

Proposition 4.1 implies in particular that the process will visit the lowest minimum in a 
given valley before exiting from it, with enormous probability. This holds true on any level 
of the hierarchy of valleys. These visits at the lowest minima thus serve as a convenient 
breakpoint to organize any transition into elementary steps that start at a lowest minimum 
of a given valley and exit just into the next hierarchy. This leads to the following definition. 

Definition 4.2: A transition $\mathcal F(x,z,y)$ is called admissible, if 

i) $x$ is the deepest minimum in the branch $T_{z,x}$, i.e. $F_N(x) = \inf_{x'\in T_{z,x}} F_N(x')$. 

ii) $z$ and $y$ are connected by a yellow arrow in $T_N$. 

Remark: We already understand why an admissible transition should start at a deepest 
minimum: if it did not, we would know that the process would first go there, and we could 
decompose it into a first transition to this lowest minimum, and then an admissible transition 
to $y$. What we do not see yet is where the condition on the endpoint (the yellow arrow) 
comes from. The point here is that upon exiting the branch $T_{z,x}$, the process has to arrive 
somewhere in the other branch emanating from $z$. We will show later that with probability 
exponentially close to one this is the first minimum to which the deterministic path starting 
from $z$ leads. 

Proposition 4.3: If $\mathcal F(x,z,y)$ is an admissible transition, then there exists $K_N > 0$, 
satisfying $N^{1-\alpha}K_N \uparrow \infty$, such that 

$$P[\mathcal F(x,z,y)] \ge 1 - e^{-NK_N} \qquad (4.9)$$

Remark: To prove this proposition, we will use the large deviation estimates that require 
the stronger regularity assumptions R2, R4, as well as the structural assumptions discussed 
in the beginning of this section. These are to some extent technical and clearly not necessary. 
Alternatively, one can replace these by the assumption that Proposition 4.3 holds, i.e. for 
any $z \in \mathcal E$ and $x \in T_{z,x}$ there is a unique $y \in T^c_{z,x}\cap\mathcal M_N$ such that (4.9) holds. 

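Proof: The proof is based on the fact that the process will, with probability one, hit the 
set $T^c_{z,x}$ eventually. Thus, if we show that given $x$ and $z$, for all $\bar y \in T^c_{z,x}$ with $\bar y \ne y$, 
$P[\tau^x_{\bar y} < \tau^x_{(T^c_{z,x}\cap\mathcal M_N)\setminus\bar y}]$ is exponentially small, the proposition follows. To simplify the notation, let 
us set $I = T^c_{z,x}\cap\mathcal M_N$. Note that the case $\bar y \notin T_z$ is already covered by Proposition 4.1, so 
we assume that $\bar y \in T_z$. Using Corollary 1.9, 

$$P[\tau^x_{\bar y} < \tau^x_{I\setminus\bar y}] = \frac{P[\tau^x_{\bar y} < \tau^x_{I\setminus\bar y\cup x}]}{P[\tau^x_I < \tau^x_x]} \qquad (4.10)$$

By reversibility, 

$$P[\tau^x_{\bar y} < \tau^x_{I\setminus\bar y\cup x}] = e^{N[F_N(x)-F_N(\bar y)]}\, P[\tau^{\bar y}_x < \tau^{\bar y}_I] \qquad (4.11)$$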



Now construct the separating hyper-surface Sn passing through z as in the proof of Theorem 
1.11. Then 



z'eSN 



< Tr 



(4.12) 



Putting all things together, and using reversibility once more, we see that 



< Tl\y 



' y 



-N[Fn{z')-Fn{x)]^ 



^ ^z 



USt, 



rl <ri 



(4.13) 



Using that $P[\tau^x_I < \tau^x_x] \ge P[\tau^x_{y'} < \tau^x_x]$ for any $y' \in I$, together with the lower bound of Theorem 
1.11 and the trivial bound 



7| < Tf\y 



< C"^Ar-(''-2)/2 ^-N[FNiz')-FN(z)]p \z' ^ ^z'^^^ 

z'eSN 

< C-'^N-id-2)/2 ^_jv[FAr(z')-Fiv(z)]p 



z ^ z 



(4.14) 



_^ j^d/2+l^-NKN 



Under our assumptions the condition $F_N(z') - F_N(z) < K_N$ implies that $|z' - z| < C\sqrt{K_N}$, 
i.e. all depends on the term $P[\tau^{z'}_{\bar y} < \tau^{z'}_{I\setminus\bar y}]$ for $z'$ very close to the saddle point $z$. Now, 
heuristically, we must expect that with large probability the process will first arrive at the 
minimum that is reached from $z'$ by following the 'gradient' of $F_N$. 

Let us now show that this is the case. Let us first remark that using the same arguments 
as in the proof of Proposition 4.1, it is clear that the probability that the process will hit the 
set where $F_N(x') > F_N(z^*(x,y)) + \delta'$, $\delta' > 0$, before reaching $y$ is of order $\exp(-\delta' N)$ so that 
this possibility is negligible. Denote the complement of this set by $L_{\delta'}$. Now consider the 
ball $D_\delta$ of radius $\delta$ centered at $z$, where $\delta$ should be large enough such that the intersection 
of $L_{\delta'}$ with $S_N$ is well contained in the interior of $D_\delta$. The set $L_{\delta'}\cap D_\delta$ is then separated by 
$S_N$ into two parts, and we call $C_\delta$ the part that is on the side of $I$. According to the previous 
discussion, if the process is to reach $I$, it has to pass through the surface $\Sigma = \partial C_\delta\cap\partial D_\delta$. 
Finally, let $R_\delta$ denote the ball of radius $\delta$ centered at $y$. Note first that 



T~ < T 



(4.15) 



x"€Rs 



The second term is exponentially small by standard reversibility arguments. It remains to 
control the first. 

= ^ P [<; < r£'' 



(4.16) 



< |S| sup ] 
x'es 



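Now under the assumptions on $F$, for all $x' \in \Sigma$, the deterministic paths $x(t,x')$ reach $R_\delta$ in 
finite time $T$ (i.e. in a microscopic time $TN$) without getting close to $\bar y$. Therefore, for some 
$\rho > 0$, 

$$P[\tau^{x'}_{\bar y} < \tau^{x'}_{R_\delta}] \le P\Big[\sup_{t\in[0,NT]} |X_t - x(t,x')| > \rho \,\Big|\, X_0 = x'\Big] \qquad (4.17)$$

But the large deviation theorem of [BG2] implies that there exists $\epsilon = \epsilon(\rho,T) > 0$, such that 

$$\limsup_{N\uparrow\infty} \frac{1}{N} \ln P\Big[\sup_{t\in[0,NT]} |X_t - x(t,x')| > \rho \,\Big|\, X_0 = x'\Big] \le -\epsilon(\rho,T) \qquad (4.18)$$

so that, e.g., for all large enough $N$, 

$$P\Big[\sup_{t\in[0,NT]} |X_t - x(t,x')| > \rho \,\Big|\, X_0 = x'\Big] \le e^{-N\epsilon(\rho,T)/2} \qquad (4.19)$$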



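It then suffices to observe that 

$$P[\tau^x_{\bar y} < \tau^x_{I\setminus\bar y}] \le \exp(-NK_N), \quad \text{for all } \bar y \ne y,$$

and so, since $P[\tau^x_I < \infty] = 1$, 

$$P[\tau^x_y < \tau^x_{I\setminus y}] \ge 1 - \exp(-NK_N). \qquad (4.20)$$

$\diamondsuit$ 
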
Note that the above argument also shows that if $\mathcal F(x,z,y)$ is admissible, and $y' \in I$, $y' \ne y$, then 

$$P[\tau^x_{y'} < \tau^x_{I\cup x}] \le P[\tau^x_y < \tau^x_{I\cup x}]\, e^{-NK_N} \qquad (4.21)$$
Theorem 4.4: Let $x,y \in \mathcal M_N$. Then there is a unique sequence of admissible events 
$\mathcal F(x_i,z_i,x_{i+1})$, $i = 1,\dots,k$, such that$^{*}$ 

$$\{\tau^x_y < \infty\} \supset \Big\{\tau^x_y = \sum_{i=1}^k \tau^{x_i}_{x_{i+1}} < \infty\Big\} \cap \bigcap_{i=1}^k \mathcal F(x_i,z_i,x_{i+1}) \qquad (4.22)$$

and such that the sequences are free of cycles, i.e. the points $x_i$ are all distinct. 

Moreover, there is a strictly positive constant $K_N$, such that 

$$P\Big[\Big\{\tau^x_y = \sum_{i=1}^k \tau^{x_i}_{x_{i+1}} < \infty\Big\} \cap \bigcap_{i=1}^k \mathcal F(x_i,z_i,x_{i+1})\Big] \ge 1 - e^{-NK_N} \qquad (4.23)$$



Proof: There is a simple algorithm that allows us to construct the sequence of admissible 
transitions. Let $z$ be the first common ancestor of $x$ and $y$ in $T_N$. First we notice that 
we will 'never' (that is to say with exponentially small probability) visit a minimum that 
is not contained in the two branches emanating from $z$ before visiting all of $T_z$. Given this 
restriction, starting from $x$, we make the maximal admissible transition, i.e. one traverses 
the highest possible saddle for which the starting point is a lowest minimum of its branch. 
This leads to some point $x_2$, from which we continue as before, with the restriction that the 
first common ancestor of $x_2$ and $y$ now determines the maximal allowed transition. This 
process is continued until an admissible transition reaches $y$. It is clear that this algorithm 
determines a sequence of admissible transitions. We have to show that this is the only one 
containing no loops. 

Note first that the condition that no transition leaves the branches of the youngest common 
ancestor follows since Proposition 4.3 ensures that the target point is reached before exit from 
this valley with probability close to one. It is easy to see that we should always choose the 
maximal admissible transition. Suppose we start in some point that is the deepest minimum 
in some valley that does not contain the target point, and we perform an admissible transition 
that does not exit from this valley. Then we must return to this point at least once more before 
reaching the target, which means that our sequence of admissible transitions contains a loop. 
Therefore, at each step the choice of the next admissible transition is uniquely determined. 

Finally, from Proposition 4.3 the estimate (4.23) follows immediately. $\diamondsuit$ 
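
The construction used in this proof can be sketched in a one-dimensional toy landscape. The code below is a minimal illustration under assumptions that are not part of the paper: the landscape is given as alternating lists `minima` and `saddles` (hypothetical names, with `saddles[k]` lying between `minima[k]` and `minima[k + 1]`), all heights are distinct, the first common ancestor of two minima is the highest saddle between them, and the "yellow arrow" from a saddle is modelled as the adjacent minimum on the far side.

```python
import math

def first_common_ancestor(saddles, i, j):
    """Index of the highest saddle between minima i and j (1D assumption)."""
    lo, hi = min(i, j), max(i, j)
    return max(range(lo, hi), key=lambda k: saddles[k])

def maximal_admissible_step(minima, saddles, i, j):
    """Return (saddle index, next minimum) of the maximal admissible
    transition from minimum i towards minimum j."""
    n = len(minima)
    top = first_common_ancestor(saddles, i, j)
    l = r = i            # current valley (branch) containing i
    best = None
    while True:
        # admissible only while i is the deepest minimum of its branch
        if minima[i] > min(minima[l:r + 1]):
            break
        left = saddles[l - 1] if l > 0 else math.inf
        right = saddles[r] if r < n - 1 else math.inf
        # the next ancestor is the lower of the two bounding saddles
        if right < left:
            k, target = r, r + 1
        else:
            k, target = l - 1, l - 1
        best = (k, target)
        if k == top:     # never cross above the first common ancestor
            break
        h = saddles[k]
        # merge with the neighbouring valley lying below height h
        if target > r:
            r = target
            while r < n - 1 and saddles[r] < h:
                r += 1
        else:
            l = target
            while l > 0 and saddles[l - 1] < h:
                l -= 1
    return best

def admissible_chain(minima, saddles, x, y):
    """Cycle-free sequence of triples (x_i, z_i, x_{i+1}) leading from x to y."""
    chain, visited = [], {x}
    while x != y:
        k, nxt = maximal_admissible_step(minima, saddles, x, y)
        if nxt in visited:
            raise ValueError("cycle detected: degenerate landscape")
        chain.append((x, k, nxt))
        visited.add(nxt)
        x = nxt
    return chain

minima = [0.0, 0.6, 0.3, 0.5]
saddles = [1.0, 0.7, 0.8]
chain = admissible_chain(minima, saddles, 1, 0)
```

In this example the walk starting at the shallow minimum 1 first moves to the deepest minimum 2 of its valley, over saddle 1, and only then crosses the highest saddle 0 to reach the target, so `chain` is `[(1, 1, 2), (2, 0, 0)]`.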

$^{*}$We hope the notation used here is self-explanatory: e.g. $\{\tau^x_y < \infty\}$ stands for 
$\bigcup_{t<\infty}\{X_0 = x, X_1 \ne y, \dots, X_{t-1} \ne y, X_t = y\}$. 
Remark: We see that the same type of reasoning would also allow us to deal with degenerate 
situations where e.g. integral curves of the gradient bifurcate and transitions to several 
points y may have non-vanishing probabilities. The picture of the deterministic sequence of 
admissible transitions should then be replaced by a (cycle free) random process of admissi- 
ble transitions. The precise computation of the corresponding probabilities would however 
require more refined estimates than those presented here (except if this can be done by using 
exact symmetries). 

Remark: Theorem 4.4 asserts that for fixed large $N$ a transition occurs along an essentially 
deterministic sequence of admissible transitions. When dealing with the dynamics of systems 
with quenched disorder, this deterministic (with respect to the Markov chain) sequence will 
however depend on the realization of the quenched disorder, and on the volume $N$. In a typical 
situation, this will give rise to a manifestation of dynamical "chaotic size dependence" (in 
the spirit of Newman and Stein; see e.g. [NS] for an overview). 

In the sequel we will always be interested in computing the times (expected or distribution) 
of transitions conditioned on the canonical chain of admissible transitions constructed in 
Theorem 4.4. We mention that in general, these do not coincide with the unconditional 
transition times. Namely, in general, there can occur unlikely excursions (into deeper valleys) 
that take extremely long times so that they dominate e.g. the expected transition times. 
Physically, this is clearly not the most interesting quantity. 

5. Transition times of admissible transitions 

From the discussion above it is clear that the most basic quantities we need to control to 
describe the long time behaviour of our processes are the times associated with an admissible 
transition. Note that an admissible transition $\mathcal F(x,z,y)$ can also be considered as a first exit 
from the valley associated with the saddle $z$ and the minimum $x$. We proceed in three steps, 
considering first the expectations of these times, then the Laplace transforms, and finally the 
probability distributions themselves. 

5.1. Expected times of admissible transitions. 

A first main result is the following theorem. 

Theorem 5.1: Let $\mathcal F(x,z,y)$ be an admissible transition, and assume that $x$ is a generic 
quadratic minimum. Then there exist finite positive constants $c$, $C$ such that, for $N$ large 
enough, 

$$E[\tau^x_y \mid \mathcal F(x,z,y)] \ge cNe^{N[F_N(z)-F_N(x)]} \qquad (5.1)$$
where $K_N$ satisfies $N^{1-\alpha}K_N \uparrow \infty$ for some $\alpha > 0$. 

Remark: In dimension $d = 1$ the upper bound captures the true behaviour (see e.g. [vK], 
where the expected transition time in $d = 1$ is computed in the continuous case; note that 
the extra factor $N$ in our estimates is just a trivial scaling factor between the microscopic 
discrete time and the appropriate macroscopic time scale). We expect that the upper bound 
has the correct behaviour in all dimensions. 

Before proving the theorem, we will prove some more crude but more general estimates. 
For this we introduce some notation. Let $I \subset \mathcal M_N$. We define 

$$d_I(x,y) = \inf_{x'\in I\cup y}\left[F_N(z^*(x,x')) - F_N(x)\right] \qquad (5.2)$$

to be the effective depth of a valley associated with the minimum $x$ with exclusion at the 
set $I$. Recall that $z^*(x,y)$ denotes the lowest saddle connecting $x$ and $y$, as defined in (1.3). 
Note that Theorem 1.11 implies that 

$$CN^{(d-2)/2}e^{-Nd_I(x,y)} \le P[\tau^x_{I\cup y} < \tau^x_x] \le C^{-1}N^{(d-2)/2}e^{-Nd_I(x,y)} \qquad (5.3)$$
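
To make the effective depth concrete, here is a small sketch in a one-dimensional toy landscape. The representation is an assumption made only for illustration: `minima` and `saddles` are hypothetical alternating lists of heights, with `saddles[k]` lying between `minima[k]` and `minima[k + 1]`, so that the lowest saddle connecting two minima is simply the highest saddle between them.

```python
import math

def lowest_saddle_height(saddles, i, j):
    """Height of the lowest saddle connecting minima i and j (1D: the
    highest saddle on the unique path between them)."""
    lo, hi = min(i, j), max(i, j)
    return max(saddles[lo:hi])

def effective_depth(minima, saddles, x, y, excluded=()):
    """Effective depth of the valley of x with target y and exclusion set
    `excluded`: the smallest barrier from x to any point of excluded + {y}."""
    targets = set(excluded) | {y}
    if x in targets:
        raise ValueError("x must not belong to the exclusion set or equal y")
    return min(lowest_saddle_height(saddles, x, t) for t in targets) - minima[x]

minima = [0.0, 0.6, 0.3, 0.5]
saddles = [1.0, 0.7, 0.8]
depth_free = effective_depth(minima, saddles, 1, 0)                  # about 0.4
depth_excl = effective_depth(minima, saddles, 1, 0, excluded=[2])    # about 0.1
```

Adding a minimum to the exclusion set can only lower the effective depth, since the infimum runs over a larger set; this is the monotonicity used repeatedly in the induction below.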

With these notations we will show the following 

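Lemma 5.2: Let $I \subset \mathcal M_N$, and $P[\tau^x_y < \tau^x_I] > 0$ (this can of course only fail in a one-dimensional 
situation). There exists $C < \infty$ such that for any $x,y \in \mathcal M_N$, we have that for 
all $N$ large enough, 

$$E[\tau^x_y \mid \tau^x_y < \tau^x_I] \le CN^{d+3} + CN^{d+3} \sup_{x'\in\mathcal M_N\setminus\{I\cup y\}} \left(P[\tau^x_{x'} < \tau^x_y \mid \tau^x_y < \tau^x_I]\, e^{Nd_I(x',y)}\right) \qquad (5.4)$$

If the set $\mathcal M_N\setminus\{I\cup y\}$ is empty, we use the convention that the sup takes the value one. 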
Remark: Note that 



^[r^x'<r^K<r^] = 



Vl, < < Tf] _ ^ Vl' < r^ui] ^ [^y < ^f] ^ P [t- < 



^ i^y < ^i] ^ [^y < ^i] " P [r^ < rf] 

(5.5) 

and thus, using the same arguments as in the proof of Proposition 4.1, whenever $P[\tau^x_y < \tau^x_I]$ 
is close to one, we have the more explicit bound 

$$P[\tau^x_{x'} < \tau^x_y \mid \tau^x_y < \tau^x_I] \le C\min\left(e^{-N[F_N(z^*(x,x'))-F_N(z^*(x,y))]},\, 1\right) \qquad (5.6)$$
This will be important in the application of this Lemma to the proof of Theorem 5.1. 

Proof: The starting point of the proof of Lemma 5.2 is the observation that it holds for 
$I = \mathcal M_N$. We formulate this as a distinct lemma. 

Lemma 5.3: Assume that $P[\tau^x_y < \tau^x_{\mathcal M_N}] > 0$. Then there exists a constant $C < \infty$ such 
that for all $x,y \in \mathcal M_N$, and all $N$ large enough, 

E [t^\t^ < rX,,] < CN''+^ (5.7) 



Proof: The first important observation is then that $P[\tau^x_y < \tau^x_{\mathcal M_N}]$ cannot be too small, i.e. 
Lemma 5.4: For any $x,y \in \mathcal M_N$, there exists $L < \infty$, such that for all $N$ large enough 

$$P[\tau^x_y < \tau^x_{\mathcal M_N}] \ge e^{-LN} \qquad (5.8)$$

Proof: Now fix any $T > 0$. Clearly 

$$P[\tau^x_y < \tau^x_{\mathcal M_N}] \ge P\left[X_{[TN]} = y,\ \forall\, 0 < t < [NT]:\ X_t \notin \mathcal M_N \,\middle|\, X_0 = x\right] \qquad (5.9)$$

So all we have to show is that the finite-time probability in (5.9) is larger than $\exp(-LN)$ 
for some constant $L < \infty$. But this is obvious by just fixing a trajectory consisting of $[NT]$ 
steps and leading from $x$ to $y$ without visiting the set $\mathcal M_N$ on the way (making sure that $T$ is 
chosen large enough to allow such a trajectory) and observing that the probability that the 
process is doing just this is at least $c^{NT}$, where $c$ is the constant from assumption (R3). $\diamondsuit$ 

Next we will use the fact that in Lemma 3.2 we have shown that gy{u) < 2 for u < Ar~<^~^. 
Now for all T < oo, 



E 



riyi/,-a!<T-x 1 



< TP [r-y < rX, J + ^ E [r,- <,+i} 

i=T 
oo 

< TP [r^ < rX, J + + 1) ill ^""^ 

i=T 

oo 

< TP [r-y < rX, J + + 1)26"^^"^'"^^^ 

i=T 

< TP [t^ < rX,,] + Ce-^''~'"^"'^N''+\T + iV'^+^) 
which for $N$ large enough is bounded by the right-hand side of (5.7) for some constant $C$, as desired. $\diamondsuit$ 

This gives us a starting point to prove the lemma by downward induction over the size 
of the set $I$. Actually, the structure of the induction is a bit more complicated. We have to 
distinguish the cases when the starting point $x$ is contained in the exclusion set $I$ and when 
it is not. We will then proceed in two steps: 

(i) Show that if (5.4) holds for all $J \subset \mathcal M_N$ with cardinality $|J| = k$ and all $x,y \in \mathcal M_N$, and 
if (5.4) holds for all $J$ of cardinality $|J| = k-1$ for all $y \in \mathcal M_N$ and $x \notin J\cup y$, then (5.4) 
holds for all $I$ with cardinality $|I| = k-1$ and all $x,y \in \mathcal M_N$. 

(ii) Show that if (5.4) holds for all $J$ with cardinality $|J| = k$ and all $x,y \in \mathcal M_N$, then (5.4) 
holds for all $J$ of cardinality $|J| = k-1$ for all $y \in \mathcal M_N$ and $x \notin J\cup y$. 

If we can establish both steps, we can conclude that since (5.4) holds for $I = \mathcal M_N$ and all 
$x,y \in \mathcal M_N$, it holds for all $I \subset \mathcal M_N$. 

We now prove both assertions. Note that $C$ will denote in the course of the proof a generic 
finite numerical constant. We will not keep track of the changes of its value in the course of 
the induction. We will set $\kappa = d + 3$. 

Step (i): We need only to consider sets $J$ of cardinality $k-1$ with $x \in J\cup y$. We can assume 
without loss that $y \notin J$. 



E 



E 



= E 



y -Mpf 



+ E ^ 

x'eMN\J\{x,y} 

+ E 

x'eMN\J\{x,y} \ 



(5.10) 
Dividing by $P[\tau^x_y < \tau^x_J]$, we get from (5.10) that 



+ E I 

x'eMN\J\{x,y} 



+ E 



Fx' < ^x. 



(5.11) 



The first summand in the last line in (5.10) produces exactly the first term in (5.4). For the 
second, observe that 



Mi 



T,? < r 7 



< 



JUy\ 



P [r- < rj] 



(5.12) 



which makes the entire term smaller than $CN^\kappa$. For the last term we may use the induction 
hypothesis for the conditional expectation time appearing in it to get 



E 



Ty \Ty < Tj 



P [r^, 



< r 



Mi 



Vy < 



< CN^ + CN" sup 

x"eMN\{J^y} 

= CN" + CN" sup 



P [t- < T]] 



But the last factor satisfies 



J±^Ndj{x",y)_ 



r5, < rif < r , 



(5.13) 



P [Tx' < 



Ml 



Vx" < < r^] 



< 



< 



Tp, <Ty KT^j 



P Vx' < 



JUyUx"! 



P [t-, < T- < T^] 



(5.14) 



So that (5.13) actually gives a term of the desired form. This proves the first inductive step. 

To complete the proof we need to turn to 
Step (ii): Here we must consider $J$ such that $x \notin J\cup y$. 
We will first consider the sub-case when $x$ is such that 

$$d_J(x,y) = \sup_{x'\in\mathcal M_N\setminus\{J\cup y\}} d_J(x',y) \qquad (5.15)$$



Note that in this situation 

sup P [r^, < t^\t^ < t]] e^'^^(^''2') > 

x'eMN\{JUy} 

But 



y ^ 'JJ „Ndj(x,y) 



so that in this case it will be enough to prove that 
Recall from Corollary 1.9 that 



(5.16) 



(5.17) 



(5.18) 



T 7, u, < T,^ 



Vl < T]^y] (5.19) 



' JUy ^ 'x 

By the induction hypothesis, the first term in (5.19) satisfies the bound 

x'eMjv\{JUyUx} 

(5.20) 

as desired (note that $d_{J\cup x}(x',y) \le d_J(x',y)$ by definition and $d_J(x',y) \le d_J(x,y)$ by assumption 
(5.15)). For the second term, we use again the induction hypothesis to get that 



' JUy ^ 'x 



< CN'' sup 

x'eMN\{JUyUx} 



{tI, <t^<t 



JUyJ ^Ndjuy(x',x) 



X ^ X 

'JUy ^ 'x 



+ 



X ^ X 

'JUy ^ 'x 



< C~^N~^'^~'^^^^e^'^-'^^''''^CN'^ sup f F[t^, < T-^'^Q^djuy{x',x) j _|_ 

x'eMN\{JUyUx} \ J 

^ (j-l^^Ndj{x,y)(jjyK g^^p I g-N{FN{z''{x,x'))-FN{x))^Ndjuy{x',x)\ 

x'eMN\{JUyUx} y J 



'JUy ^ 'x 



(5.21) 
It remains to show that 

$$e^{-N(F_N(z^*(x,x'))-F_N(x))}\, e^{Nd_{J\cup y}(x',x)} = e^{-N(F_N(z^*(x,x'))-F_N(x))}\, e^{\inf_{x''\in J\cup y\cup x} N(F_N(z^*(x'',x'))-F_N(x'))} \qquad (5.22)$$

is bounded by one. We consider two cases: 

(i) Assume that 

$$\inf_{x''\in J\cup y\cup x} F_N(z^*(x'',x')) - F_N(x') = F_N(z^*(x,x')) - F_N(x') \qquad (5.23)$$

Define $z^*(x,A)$ by $F_N(z^*(x,A)) = \inf_{x'\in A} F_N(z^*(x,x'))$. Then in this case, $x$ is 'closer' 
to $x'$ than to $J\cup y$, so that 

$$z^*(x,J\cup y) = z^*(x',J\cup y) \qquad (5.24)$$

Thus by assumption (5.15) 

$$\inf_{x''\in J\cup y} F_N(z^*(x'',x')) - F_N(x') = F_N(z^*(x,J\cup y)) - F_N(x') \le F_N(z^*(x,J\cup y)) - F_N(x) \qquad (5.25)$$

This implies that $F_N(x) \le F_N(x')$, and since 

$$e^{-N(F_N(z^*(x,x'))-F_N(x))}\, e^{\inf_{x''\in J\cup y\cup x} N(F_N(z^*(x'',x'))-F_N(x'))} \le e^{-N(F_N(x')-F_N(x))} \qquad (5.26)$$

the sup is bounded by 1 as desired. 

(ii) We are left with the case 

$$\inf_{x''\in J\cup y\cup x} F_N(z^*(x'',x')) - F_N(x') < F_N(z^*(x,x')) - F_N(x') \qquad (5.27)$$

Here $x'$ is closer to $J\cup y$ than to $x$, and so 

$$z^*(x,J\cup y) = z^*(x,x') \qquad (5.28)$$

But then, again by (5.15), 

$$F_N(z^*(x,x')) - F_N(x) = F_N(z^*(x,J\cup y)) - F_N(x) \ge \inf_{x''\in J\cup y\cup x} F_N(z^*(x',x'')) - F_N(x') \qquad (5.29)$$

which has the desired implication. This covers the case when (5.15) holds. 
Let us now turn to the case when (5.15) does not hold. Let $x^* \in \mathcal M_N$ be such that 

$$d_J(x^*,y) = \sup_{x'\in\mathcal M_N\setminus\{J\cup y\}} d_J(x',y) \qquad (5.30)$$

By assumption $x^* \ne x$. We can write 

E 



^[ry\ry<r]] 



Now 



iI)=E[r^\T^<T]^^,] 



P [ry < r5u.>] 

P [t- < T-j] 



= (/) + {II) (5-31) 



(5.32) 



and using the induction hypothesis, 



(/) < CN" + CAT'' 



sup p 1 

e7Wjv\{JuyUx*} P [t^ < rj\ 



JUx*J Nd.,u.*(x',y) 



< CN^ + CN" sup 



1"^ r 1 

as desired. On the other hand, 



,Ndj{x',y) 



E 



E 



+ 



IT . 



l^y < 



(5.33) 



(5.34) 



= (I/a) + (//6) 
To treat (IIa) we can use the induction hypothesis to get 
{I la) = E [r|.|r|. < r^u,] P < r.^r,^ < rj] 

P [r-, < < rJu,] 



< CAr« + CN" sup 

x'ejMiv\{-/Ux*U2/} 



^Ndjuy{x' ,x*) 



= CN'^ + CN" sup . 

x'eA1iv\{JUx*U2/} P [t^ < Tj 



< CAr« + CA^« sup 

x'eMjv\{JUa;*U2/} 



T,7 < r 



JigJVdj(x',2/) 



since $d_{J\cup y}(x',x^*) = \inf_{x''\in J\cup y\cup x^*}\left[F_N(z^*(x',x'')) - F_N(x')\right] \le d_J(x',y)$, and finally 



(776) = E 

= E 



JUyl 



P [r- < rj] 



^y K < ^.J 



(5.35) 



(5.36) 
where the last line is obtained by using that the conditional expectation in the last-but-one 
line is of the form considered just before. Putting everything together, we 
see that the lemma is proven. $\diamondsuit$ 

Proof of Theorem 5.1: Lemma 5.2 can now be used to prove the theorem. For this, let 
$\mathcal F(x,z,y)$ be an admissible transition and fix $I = T^c_{z,x}\setminus y \cap \mathcal M_N$. 

We have already seen in (5.17) that P [r^ < T/uy] differs from 1 only by an exponentially 
small term. Moreover, using (4.21), we see that in the case of an admissible transition, 



P [r^ < T^] <P [rfu, < r^] < P [r^ < r^] + ^ : 

y'ei 

-NKm\ 



xUli 



Therefore (5.19) implies in this case that 



^[ry\ry<rl 



(1 + C>(e-^^^)) + E[ 



(5.37) 



(5.38) 



P [t^ < T^] 

Now we have already precise bounds on the denominator of the first term (see Section 2), 
and using the upper bound from Lemma 5.2 (taking into account that we are now in the 
situation of the remark following that Lemma!) we see that, under the assumptions of the 
theorem, the second term is by a factor $N^\kappa \exp(-K_N N)$ smaller than the first. It remains 
to estimate precisely the numerator in the first term. The essential idea here is to use the 
ergodic theorem. It may be useful to explain this first in a simpler situation where there 
is only a single minimum present and consider the quantity $g^y_y(u)$. Let $D \subset \Gamma_N$ be the 
local valley associated to $y$, that is the connected component of the level set of the saddle 
point that connects $y$ to the rest of the world. The basic idea is to show that the expected 
recurrence time at $y$ (without visits at other points of $\mathcal M_N$) is up to exponentially small errors 
equal to the same time of another Markov chain $X_D(t)$ with state space $D$, with transition 
rates $p_D(x,z)$ defined as in (2.8), and whose invariant measure is easily seen to be just 
$Q_N$ conditioned on $D$, i.e. $Q_D(x) = Q_N(x)/Q_N(D)$ for any $x \in D$. Then, by the ergodic 
theorem, the expected recurrence time of $X_D$ at $y$ equals 

$$\frac{1}{Q_D(y)} = \frac{Q_N(D)}{Q_N(y)} \qquad (5.39)$$

This quantity can be estimated very precisely via sharp large deviation estimates. It will 
typically exhibit a behaviour of the form $CN^{d/2}$. 
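
The identity behind (5.39), that the mean recurrence time of an ergodic chain at a point is the inverse of its invariant mass there, can be verified exactly on a small example. The sketch below is an illustration only: it assumes a nearest-neighbour Metropolis chain on a few sites with integer weights (the weights and all function names are hypothetical), and it computes the mean return time by solving the linear equations for expected hitting times in exact rational arithmetic.

```python
from fractions import Fraction

def metropolis_matrix(weights):
    """Nearest-neighbour Metropolis chain reversible w.r.t. `weights`."""
    n = len(weights)
    P = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                P[i][j] = Fraction(1, 2) * min(Fraction(1),
                                               Fraction(weights[j], weights[i]))
        P[i][i] = 1 - sum(P[i])  # holding probability
    return P

def solve(A, b):
    """Gauss-Jordan elimination over the rationals."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]

def mean_return_time(P, y):
    """Expected time of the first return to y for the chain started at y."""
    others = [i for i in range(len(P)) if i != y]
    # h(i) = expected hitting time of y from i solves (1 - P restricted) h = 1
    A = [[(1 if i == j else 0) - P[i][j] for j in others] for i in others]
    h = solve(A, [Fraction(1)] * len(others))
    return 1 + sum(P[y][j] * hj for j, hj in zip(others, h))

weights = [1, 3, 2, 5]
P = metropolis_matrix(weights)
return_times = [mean_return_time(P, y) for y in range(len(weights))]
```

With these weights the total mass is 11, and each entry of `return_times` equals 11 divided by the weight of the corresponding site, as the ergodic theorem predicts.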

To arrive at this comparison, we simply divide the paths in our process into those reaching 
the boundary of $D$ and those that do not, i.e. we write 



E 



e^'^y'K^y^^y TL^y ^^y 
^y—^MN dD^^y 



^yl^y 



(5.40) 
Let us denote by $D^+$ and $D^-$ the two sets obtained by adding and removing, respectively, 
one layer of points to, resp. from, $D$. Note that on the event $\{\tau^y_{\partial D} > \tau^y_y\}$ the processes $X(t)$ 
and $X_{D^+}(t)$ have the same law until time $\tau^y_y$, so that 



E 



' dD^ 'y 



e y 



(5.41) 



We will show that this is the dominant term in (5.40), the first summand on the right being 
exponentially small. Indeed 



E 



tedD 



" - yUdD 



E 



(5.42) 



Using Theorem 1.10, for small enough u, the first factor is bounded by 
const.N'^e~^^^^^^^~^^^''^^^ , while the second is bounded by const. N'^. This gives the desired 
upper bound 



D+ 



(5.43) 



where z* denotes the lowest saddle point in dD. To get the corresponding lower bound, just 
note that 



E 



D+ 



e"^« l^y 



E 



D+ 



■E 



D+ 



e"^« l^y 



' ao — 'y 



But the last term in (5.44) can be treated precisely as in (5.42), so that we arrive at 



(5.44) 



(5.45) 



Differentiating and using reversibility and the upper bounds from Theorem 1.10, as well as 
the obvious lower bound 

5^(0) = l-P[rX,^<r,^] >1- Yl 5^(0)>l-|>l,v|iV<^-V^[^-(^*)-^-(^)l (5.46) 

xeMN\y 



gives in the same way that 

gm Qiv(y) 



+ C)(Ar'*)e"^[^^("^')"-^'^(2/)l 



(5.47) 



The same ideas can now be carried over to the estimation of the return time in an admissible 
situation, using the estimates from Lemma 5.2. 
Proposition 5.5: Let $\mathcal F(x,z,y)$ be an admissible transition. Let $D$ denote the level set of 
the saddle $z$. Then for $K_N > 0$ satisfying $N^{1-\alpha}K_N \uparrow \infty$ for some $\alpha > 0$, 



(5.48) 



where $I = T^c_{z,x}\setminus y \cap \mathcal M_N$. 



Proof: Basically, the proof goes as outlined above. With D defined as the level set of the 
saddle z, we can decompose 



E 



E 



+ E 



X ' x^' luy ' X ^ ' dD 



(5.49) 



The first summand gives precisely 



E 



^ ' ' luy ' x^' g 



E 



^ ' x^' go 



(5.50) 



so that from this term alone we would get the same estimate as in (5.47). We have to show 
that the second term does not give a relevant contribution. Note that as in (5.42) we can 
split paths at the first visits to dD. This gives 



E 



^ ' x^' luy ' X ' g 



z'edD 



lUy 



(5.51) 



Now P [t^i < Tgjy,T^, < T^] is bounded by e ^l^^iz) Fjv(a;)]^ reversibility 



E 



T^/^-x <:^x ^ X 



< ^-N[Fn{z)-Fn{x)]^ 



, so all we have to show is that the 



two quantities E 



and E 



lUy. 



(which are more or less the same) are 



not too large. But this follows from our previous bounds by splitting the process going from 
z' to X at its first visit to a point in T^^x H Mn, e.g. 



E 



L ^ ^x <^/Uy 



E 



'x' ^T^,<T^' 

x' — Al n 



+ 



'x' — ^Mn 



E 



X ^-T-X' ^ ^X' 

. ^x <^IUy 



(5.52) 



Lemma 5.2 and Theorem 1.10 can now be used on the expectations in (5.52), and this implies 
the desired result. $\diamondsuit$ 

Now if (as we assume) $x$ is a quadratic minimum of $F_N$, $Q_N(D)/Q_N(x) = CN^{d/2}$, and using this 
together with Theorem 1.11 we get the estimates of Theorem 5.1. $\diamondsuit\diamondsuit$ 
Remark: The reader will have observed that we could also prove lower bounds for more 
general transitions, complementing Lemma 5.2. But the point is that these would depend 
in a complicated way on the global specifics of the function $F_N$, contrary to the situation 
of admissible transitions for which we get the very simple estimates of Theorem 5.1. The 
beauty of the construction lies in some sense in the fact that the general "worst case" upper 
bounds of Lemma 5.2 suffice to obtain the precise estimates of the theorem. 



5.2 Laplace transforms of transition times of admissible transitions 

Theorem 5.1 gives precise estimates on the expected transition times for an elementary 
transition. We will now show that as expected, the distribution of these transition times is 
asymptotically exponential. This will be done by controlling the Laplace transforms for small 
arguments. 

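Theorem 5.6: Let $\mathcal F(x,z,y)$ be an admissible transition. Set $\bar T_y = E[\tau^x_y \mid \mathcal F(x,z,y)]$. 
Then 

$$E\left[e^{v\tau^x_y/\bar T_y} \,\middle|\, \mathcal F(x,z,y)\right] = \frac{1}{1-v} + e^{-NK_N} f(v) \qquad (5.53)$$

where for any $\delta > 0$, for $N$ large enough, $f$ is bounded and analytic in the domain $|\mathrm{Re}(v)| < 
1 - \delta$. 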
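
The leading term $1/(1-v)$ is the Laplace transform of a mean-one exponential random variable. A simple way to see how such a term arises is the following toy computation, which is not the argument of the paper: assume the exit time is geometric, i.e. each unit of time the exit succeeds independently with a small probability `p` (a hypothetical stand-in for an exponentially small escape probability). The Laplace transform of the exit time rescaled by its mean is then explicit and approaches $1/(1-v)$ as `p` tends to zero.

```python
import math

def rescaled_laplace_transform(p, v):
    """E[exp(v * tau / E[tau])] for tau geometric on {1, 2, ...} with
    success probability p, so that E[tau] = 1 / p."""
    s = v * p                        # argument after rescaling by the mean
    q = (1.0 - p) * math.exp(s)
    if q >= 1.0:
        raise ValueError("transform diverges for this value of v")
    # geometric series: sum_{k>=1} p (1-p)^(k-1) e^(s k)
    return p * math.exp(s) / (1.0 - q)

value_half = rescaled_laplace_transform(1e-6, 0.5)    # close to 1 / (1 - 0.5)
value_neg = rescaled_laplace_transform(1e-6, -1.0)    # close to 1 / (1 + 1)
```

In this toy model the correction to $1/(1-v)$ is of order `p`, which plays the role of the exponentially small error term in (5.53).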

Proof: The main ingredient of the proof lies in controlling the analytic structure of the 
Laplace transforms. The procedure will be similar to that in the proof of Lemma 5.2, that is 
we consider the entire family of functions $G^x_{y,I}(u)$ and establish the corresponding domains 
by induction, starting with the case $I = \mathcal M_N$ where the analytic estimates of Theorem 1.10 
hold. It will be convenient to use functions where the argument $u$ has been properly rescaled. 
The naive expectation might be that the Laplace transform will exist for values of u up to 
the inverse of the corresponding expected transition time. However, this is not so. The point 
is that Laplace transforms are much more sensitive to "deep valleys" than the expected times 
for which such valleys contribute less if they are unlikely to be visited. However the Laplace 
transform will only partly benefit from this, but simply explode at a value corresponding to 
the deepest valley that is at all allowed to be visited. 

We introduce some more notation for convenience. Set 

$$t_I(x,y) = e^{Nd_I(x,y)} \qquad (5.54)$$
and 

$$T_I(y) = \sup_{x\in\mathcal M_N\setminus\{I\cup y\}} t_I(x,y) \qquad (5.55)$$



With the notation of Section 1, we define 



Gli {v/Ti{y)) 
P [r- < rf] 



(5.56) 



The following key lemma gives us control over how this happens. It is the analogue of Lemma 
5.2. 

Lemma 5.7: Let I G Mn, o-nd let x,y G Mn- Then j(y) can he represented in the 
form 

Sim = oii.,j(viT,(y)) 



x'eMN\{I^y} 



where a^y j and y j, for any x' G A^jv\{I U j/} are complex functions that have the prop- 
erties (for a finite constant C , and k = d + 3): 

(i) They are bounded by CN'^ and analytic in the domain \Re{u)\ < CN~'^ , 

(ii) They are real and positive for real v. 

Proof: An important corollary of analyticity are corresponding bounds on the derivatives. 
Namely, by a standard application of the Cauchy integral formula it follows that for any 
function a which is bounded and analytic in the domain |i?e(i;)| < CN~'^,foi \Re{v)\ < ^N~'^ 
we have, 



d^ 

- — a(v) 



< n\C- 



sup a(y) 

v:Re{v)<CN-"- 



(5.58) 



This will be used repeatedly in the sequel. 



The first step in the proof is to show that (5.57) holds if $I = \mathcal M_N$. The proof of this 
fact is completely analogous to that of Lemma 5.3 and we will skip the details. Let us just 
mention that we will get the boundedness of Gy j^^ {v) in a domain Re{u) < cN~'^~'^ (while 
analyticity was established in the larger domain Re{u) < cN~'^~^ in section 3). This provides 
the starting point for our induction like in Lemma 5.2. Again we assume that the Lemma 
holds for all $I$ of cardinality greater than or equal to $\ell$, and we consider sets $J \subset \mathcal M_N$ of 
cardinality $\ell - 1$. As before, we first show that the case $x \in J$ reduces easily to the case 
$x \notin J$. Without loss of generality we assume $y \notin J$. Namely, in the former case, 



GlAv) = 



gyiv/Tjjy)) 



+ 



E 



*(»/rj(!;))P 



x'eMN\{JUy} 



■GlAv) (5.59) 



Inserting the induction hypothesis for the Gy j{v), we get 



^ > TU I ^ r,-X I z_-/ 



< '^jJ x"eMN\{Juy} 



<',y,ji^/Tjiy)) 



+ 



E E 

x'eMN\{JUy} x"eMN\{JUy} 
P [tx'" 



9^Av/Tj{y))^[-^^'^-M.]^ 



rif < r f 



P [r- < T^] 



t.j{x",y) 
' Tj{y) 



(5.60) 



Remember that in the proof of Lemma 5.2 ((5.13)-(5.14)) we have established that 



Tif < r r 



P [t- < rj] 



T^„ < Ty \Ty < Tj 



<P[r^>,<r^\T^<r]] (5-61) 



which, since the analytic properties of a^, y j and cu^y j are the same, shows that (5.60) 
provides the claimed representation. 

The more subtle part of the proof concerns the case $x \notin J$. We proceed again as in the 
proof of Lemma 5.2 and consider first the case when Tj{y) = dj{x,y). Note again that in 
this case, the representation in the Lemma reduces to 



since all the other terms in the sum (5.57) are smaller and more regular. 
We use of course that 



which implies 



Gy^JUxi"^) 



Gx,JUyi'^) 



GlAv) = 



p r_x ^ _x |_x ^ _xl P<x I ^, Tjux(y) 

^ I'y ^ VUxTy ^ ^y,JUx Tj{y) 



1-: 



T-X ^ T-X 

'x ^ JVJy 



G 



( Tjuy{x) \ 

x,JUy Tj{y) J 



y,Jlix y Tj(y) 



) 



1 - ^r^-^v^m^ J, ded'ij^y I uv 



Tj{y) ) 



(5.62) 



(5.63) 



(5.64) 
The numerator again poses no problem since it permits to obtain the desired representation 
by the inductive hypothesis. Potential danger comes from the denominator. But using the 
induction hypothesis, we see that (similar to (5.20)) 



Tj{y) 



{x) 



x,JVJy 



T 



JUy 



Tj{y) 



■JUy 



Tjiy) 



^ " x,x,JUy 



.x'eMN\{JUxUy} 



ev- 



tjuy{x',x) ^^ tjuy{x',x) _^ ^0 f ^ 1 



Tj{y) 



'-JUyix) 



(5.65) 



E 



x'eMN\{J'JxLly} 



\ X ^ X] ^JUy{^ 1 ^) ;x' 
I'x'^'xi Tj(y) 2;,x,JUy 



tjUy{x',x) 

Tj{y) 



1 



x,x,Jyjyyj,^^y^J Tj(y) 

All we need is to bound the modulus of this expression from above for v real. This gives 



Ov 



TjUyjx) 

Tj{y) 



x'eMN\{JOxUy} 



(5.66) 



Tj{y) Tj{y) 
In the second part of the proof of Lemma 5.2 ((5.21) we have shown that 

Thus we deduce from (5.66) that 



TjUy {X) 



Tj{y) 



x,JUy 



T 



JUy 



X 



Tj{y) 



< 



TAy) 



(5.67) 



(5.68) 



We see thus that the numerator in (5.64) will not vanish for -y < C Thus Gy j{v) is 

bounded and analytic in the strip $|\mathrm{Re}(v)| < C^{-1}N^{-\kappa}$. 

In the general case, we proceed as in the proof of Lemma 5.2 and decompose all paths into 
those avoiding the deepest minimum x* and those visiting it. A simple computation yields 
then 

^y,jW -^y,jux'' [V j.^^y^ J P[r-<r5] 



+ F [r^' < r^yK < r^j] G^x',Juy (^^^^^) ^^j (-) 



(5.69) 
Clearly the first term is by the inductive assumption at least as good as desired, and since 
we have just shown that the last factor in the second term is bounded and analytic in 



|i2e(t;)| < CN using the inductive assumption for 



one sees easily 



^x*,JUy Tj{y) 

that the desired representation holds. This concludes the proof of the Lemma. $\diamondsuit$

We are now ready to prove Theorem 5.6. For this we have to improve the previous analysis
in the case where $x$ is the deepest minimum in the allowed set $\mathcal{M}_N\setminus I$. In that case $T_I(y)$ is
strictly larger than any of the terms $t_{I\cup x}(x',y)$ and $t_{I\cup y}(x',x)$, and it will pay to use Taylor
expansions to second order. Also, we will be more precise in the rescaling of the variables $u$
and define

G^iiv) ^ GljivTj{y)/f) (5.70) 

with 



E 



T = 



yUI 



^fuy < ^x 



(5.71) 



Note that we use T instead of Ty in the proof because this will simplify the following formulas. 
But recall from the proof of Theorem 5.1 that 

E[T-\T-<Tf] 



< e 



-NKn 



Thus in the final results they can be interchanged without harm. Then 

^ Fa ^ ^lUxl^y ^ '^y,lLlx y" f J 



Gyjiv) 



1-: 



,-7-JU "T 

'x ^ lyjy 



G: 



x,IUy 



(5.72) 



(5.73) 



Using the analyticity properties established in the preceding lemma, we now proceed to a 
more careful computation, using second order Taylor expansions. This yields 

GyAv) 



■f] (l + ^ 



_|_ (^Tju^Mr Q" 



y,IUx 



(5.74) 



for some $0 < \theta < 1$. We use Corollary 1.9 to get

GyA^) = 



i -I- jtIC. \Ty \Ty < TjuxJ + 2T2 yjux ' 



T 



l + fE[r-|r-<rfu,] + 



-1 V ' ' - IViy- \-'^-^v\~ij -L-x lUy' (~\llx { Q„, '^IUy{x) \ 



2T2 ^ y,lux I '^'^ 



Tiux(y) 



(5.75) 
The term we must be most concerned with is the second order term in the denominator.
Here we must use the full analyticity properties proven in Lemma 5.7. This gives, after
computations analogous to those leading to (5.66) and using the obvious lower bound on $\bar T$



{vTl^y{x)f 



2TE r-|T- < rfu. 



Tluy{x) 



2E 



' x\' X ^ ' luy 



(5.76) 



^ g-Ar[Fjv(^*(x',x))-F;v(x)+F;v(^*(x,j/))-F;v(x)] (tiuy{x',x)y 



x'€MN\{JUyUx} 

Now according to our hypothesis that x is the lowest minimum in I, it follows that ti[jy{x' , x) <| 
^N[F^{z*{x,x'))-Fr,{x)] ^^^^^ (-g ^g^ g^^^^^jy bounded by 



2 ' ' 



sup e 

x'eMN\{lLlyLlx} 



-Ar[Fjv(z*(x,2/)-Fjv(z*(x',x))] 



(5.77) 



2 ' ' 



where $\delta$ is strictly positive.



All the other terms in (5.75) except the leading ones are even smaller. Note moreover
that both $T_{I\cup y}(x)$ and $T_{I\cup x}(y)$ are exponentially small compared to $\bar T$, so that all these error
terms, as functions of $v$, are analytic if $|\mathrm{Re}(v)| < 1$. This allows us to write $G_{y,I}(v)$ in the form



+ e 



-NKn/2 



ei{v) 



+ 



(5.78) 



l-v ' ' l-v {l-v){l-v-e-^^'^l'^eri{v)) 

where all $e_i$ are analytic and uniformly (in $N$) bounded in the domain $|\mathrm{Re}(v)| < 1$. This
concludes the proof of the theorem. $\diamondsuit$

5.3. The distribution of transition times. 

From Theorem 5.6 one obtains of course some information on the distribution function.

Corollary 5.8: Under the same assumptions as in Theorem 5.6, we have:

(i) For any $\delta > 0$, for sufficiently large $N$,

$$ P\left[\tau^x_y > t\bar T \mid \mathcal{F}(x,z,y)\right] \le e^{-t(1-\delta)}/\delta \tag{5.79} $$

(ii) Assume that $N_i$ is a sequence of volumes tending to infinity such that for all $i$, $\mathcal{F}(x,z,y)$
is an admissible transition. Then, for any $t > 0$,

$$ \lim_{i\uparrow\infty} P_{N_i}\left[\tau^x_y > t\bar T_{N_i} \mid \mathcal{F}(x,z,y)\right] = e^{-t} \tag{5.80} $$

Proof: (i) is an immediate consequence of the fact that the Laplace transform is bounded for real
positive $v$ with $v < 1$, together with the exponential Chebyshev inequality. (ii) is a standard consequence
of the fact that the Laplace transform converges pointwise for any purely imaginary $v$ to that
of the exponential distribution, and is analytic in a neighborhood of zero. $\diamondsuit$

With a little more work we can also complement the upper bound (5.79) by a lower bound 
on the distribution of the survival time in a valley. 

Proposition 5.9: Let $\mathcal{F}(x,z,y)$ be an admissible transition, and set $I = \mathcal{M}_N\setminus\mathcal{T}_{z,x}$. Let
$h(N)$ be any sequence tending to zero as $N$ tends to infinity. Then, for some $k < \infty$, for any
$0 < \alpha_t < 1$ we have that



'[rf>t]>{ 



' g-v(f (i-aO)c,2 (1 _ ;j(jv)) if t > h{N)j^. 



an (5-81) 



Proof: The proof of this lower bound consists essentially in guessing the strategy the process
will follow in order to realize the event in question, which will be to return a specific number
of times to $x$ without visiting the set $I$. For, obviously,



SI , . . . ,Sti >1 

siH \-S'n>t 

n 

si,---,sn>l i=l 
siH \-Sn>t 

n 

= (F [r^ < rf]r E n ^ = ^ 



(5.82) 



•.Sn>l i = l 
■ + Sn >t 



We introduce the family of independent, identically distributed variables $Y_i$ taking values in
the positive integers such that $P[Y_i = s] = P[\tau^x_x = s \mid \tau^x_x < \tau^x_I]$. Then (5.82) can be written
as



'[rf>t]>{¥[T^<Tf]r^ 



Y.Y,>t 



.i=l 



(5.83) 



We have good control on the first factor in (5.83). We need a lower bound on the second
probability. The simplest way to proceed is to use the inequality, going back to Paley and
Zygmund, that asserts that for any nonnegative random variable $X$ with finite expectation, and any $0 < \alpha < 1$,
$P[X > (1-\alpha)EX] \ge \alpha^2\,\frac{(EX)^2}{EX^2}$. We will use this with $X = \sum_{i=1}^n Y_i$ and $\alpha = 1 - \frac{t}{nEY_1}$ (for $t < nEY_1$). Since the $Y_i$ are i.i.d., $EX = nEY_1$ and $EX^2 = n(n-1)(EY_1)^2 + nEY_1^2$. Thus

$$ P\Big[\sum_{i=1}^n Y_i > t\Big] \ge \Big(1 - \frac{t}{nEY_1}\Big)^2 \frac{n^2(EY_1)^2}{n(n-1)(EY_1)^2 + nEY_1^2} \tag{5.84} $$
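As a sanity check of the Paley-Zygmund step, the following minimal Python sketch compares both sides of the inequality for $X = \sum_{i=1}^n Y_i$. The geometric law chosen for the $Y_i$, the parameter values and the name `paley_zygmund_check` are illustrative assumptions and are not taken from the text; the moments used are the empirical ones of the simulated sample.

```python
import random


def paley_zygmund_check(n=20, p=0.3, alpha=0.5, trials=5000, seed=0):
    """Return (lhs, rhs) of P[X > (1-alpha) E X] >= alpha^2 (E X)^2 / E X^2.

    X is a sum of n i.i.d. geometric variables on {1, 2, ...} (an assumed
    stand-in for the return times Y_i); moments are empirical.
    """
    rng = random.Random(seed)

    def geometric():
        k = 1
        while rng.random() > p:  # success probability p at each step
            k += 1
        return k

    samples = [sum(geometric() for _ in range(n)) for _ in range(trials)]
    mean = sum(samples) / trials
    second_moment = sum(s * s for s in samples) / trials
    threshold = (1 - alpha) * mean
    lhs = sum(1 for s in samples if s > threshold) / trials
    rhs = alpha ** 2 * mean ** 2 / second_moment
    return lhs, rhs
```

Because the inequality holds for every probability law on the nonnegative reals, it holds in particular for the empirical law of the sample, so `lhs >= rhs` for any seed.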



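Now using Lemma 5.7 one verifies easily that

$$ EY_1^2 \le CN^k + T_I(x)e^{-NK_N} \tag{5.85} $$

so that, since $EY_1 \ge 1$, (5.84) gives

$$ P\Big[\sum_{i=1}^n Y_i > t\Big] \ge \Big(1 - \frac{t}{nEY_1}\Big)^2 \frac{1}{1 + \frac{1}{n}\left(CN^k + T_I(x)e^{-NK_N}\right)} \tag{5.86} $$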



Thus the second factor is essentially equal to one if $n \gg \max(N^k, T_I)$. We now choose $n$ as
the integer part of $n(t)$ where



n{t) = min 



1 



EYi{l-at)' h{N)iCN- + Tj) 



(5.87) 



This yields 



' [rf > > (1 - P [rf < <])"(*^ a^ 



l + ^(C7V« + rKx)e-^-^-) 

> g-t^gS-to(F[.F<x:]^)^2 (1 _ 



(5.88) 



if $t$ is such that $n(t)$ is given by the second term in the minimum in (5.87). This yields
the first case in (5.81). If $t$ is smaller than that, one sees easily that $n(t)P[\tau^x_I < \tau^x_x]$ tends to
zero uniformly in $t$, as $N \uparrow \infty$ (in fact exponentially fast!). This implies the second case. $\diamondsuit$



6. Miscellaneous consequences for the processes 

In this section we collect some soft consequences of the preceding analysis. We begin by 
substantiating the claim made in Section 1 that the process spends most of the time in the 
immediate vicinity of the minima. We formulate this in the following form. 
Proposition 6.1: There exist finite positive constants $C, k$ such that for any $\rho > 0$,
$x \in \mathcal{M}_N$ and $t > 0$,

$$ P\left[|X_t - x| > \rho \mid \tau_{\mathcal{M}_N\setminus x} > t,\ X_0 = x\right] \le CN^k \inf_{\rho'<\rho}\ \sup_{y\in\Gamma_N:\ \rho'<|x-y|<3\rho'/2} e^{-N[F_N(y)-F_N(x)]} \tag{6.1} $$



Proof: We begin by decomposing the event $\{\tau_{\mathcal{M}_N\setminus x} > t\}$ as follows:

{rM^\x >t}= U i^M^\x > 4 n {Xs = X,ys<s'<tXs' ^ Mn} 
0<s<t 

Then 

P [\Xt -x\> p\tX^^\, >t,Xo = x 



(6.2) 



E 

0<s<t 







= x 


p 


^Mn\x 


> t 





P [\Xt_s -x\> p, rX,^ > t - s\Xo = x] 



p 


'''Mn\x > S,Xs — X 






P 





P [\Xt_, -x\> p,TX,^>t- s\Xo = x] 



inf < 

p'<p 



< inf 

p'<p 



E E 

a:p'<|x-j/|<3p72 0<s<t 

E E 

3/:p'<|x-j/|<3p72 0<s<t 



min(P [r^<rX,,],F[TX,,>t-s]) 



'Mn\x 



> t - S 



Now 



while 



MN\y 



yeMN 



(6.3) 
(6.4) 

(6.5) 



Note that by Theorem 1.10 and the exponential Markov inequality, for some k < oo 

< c-TV^e-^*-")^^"" (6.6) 



Ty >t- S\Ty < Tl^^\y 



while the factors P 



are exponentially small except if $y = x$. The denominator
is bounded below by Proposition 5.9. Since it decays with $t - s$ at an exponentially smaller
rate than the numerator (see (6.6)), and is close to one for times up to the order $\exp(NC)$
(for some $C$), it is completely irrelevant.
Thus we see that the second term in the minimum takes over for $t - s > N^{k+1}[F_N(y) - F_N(x)]$,
a number small compared to the inverse of the first term in the minimum. Thus
using that, for $a, b \ll 1$,

y min (e-«*, b) < + ^^{\lnb\ + 1) (6.7) 

t=o 

the result follows immediately. $\diamondsuit$

Based on this result, we will now show that during an admissible transition the process 
also stays mostly close to its starting point, i.e. the lowest minimum of the valley concerned. 
The following proposition makes this precise. 

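Proposition 6.2: Let $\mathcal{F}(x,z,x')$ be an admissible transition. Then there exist finite positive
constants $C, k$ and $K_N$ s.t. $N^{1-\alpha}K_N \uparrow \infty$ for some $\alpha > 0$, such that for any $t$ and
$\rho > 0$,

$$ P\left[|X_t - x| > \rho \mid \tau_{\mathcal{T}^c_{z,x}} > t,\ X_0 = x\right] \le CN^k \inf_{\rho'<\rho}\ \sup_{y\in\Gamma_N:\ \rho'<|x-y|<3\rho'/2} e^{-N[F_N(y)-F_N(x)]} + e^{-NK_N} \tag{6.8} $$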



Proof: The proof of this proposition is in principle similar to that of Proposition 6.1. We 
begin by applying the same decomposition as before to get 



^\Xt — x\ > p\Tr^^ > t,XQ = X 

Pfrf, >s,X, = x 

- E - 



0<s<t 



\Xt-s -x\> p, rf-o ux > * - ■^l-'^o = X 



inf < 
p'<p 



< inf 
p'<p 



E E ■ 

y:p'<\x-y\<3p'/2 0<s<t 

E E 

y:p'<\x-y\<3p'/2 0<s<t 



(6.9) 



Tj-a > t — S 



min ^P 




,P 




) 


P 


rf-c > t — s 

z,x 





As in the proof of Proposition 6.1, the denominator is bounded by Proposition 5.9 and is 
seen to be insignificant. We concentrate on the estimates of the numerator. Again we have 
the obvious bound 
but dealing with the second probability in the minimum will be a little more complicated.
Note first that as in (6.5) we can write



>t-s 



+ 



(6.11) 



Now the terms in the second sum are all harmless, since by the estimates of Lemma 5.7 and
the geometry of our setting, using the exponential Chebyshev inequality as in Corollary 5.8,
for any $\delta > 0$,



Ty >t S\Ty < T^C^^^\y 



< ^-it-s)(l-S)/TT-a^u.{y)^-N[FNiz)-FN(x)] 



(6.12) 

with $T_{\mathcal{T}^c_{z,x}\cup x}(y)$ much smaller than $e^{N[F_N(z)-F_N(x)]}$. The remaining term is potentially dangerous.
To deal with this efficiently, we need to classify the trajectories according to the
deepest minimum they have visited before returning to $x$. In the present situation the relevant
effective depth of a minimum $y \in \mathcal{T}_{z,x}$ is (recall (5.2))

$$ d(y) \equiv d_{\mathcal{T}^c_{z,x}}(y,x) = F_N(z^*(y,x)) - F_N(y) \tag{6.13} $$



We will enumerate the minima in $\mathcal{T}_{z,x}$ according to increasing depth by $x = y_0, \dots, y_k$ (we
assume for simplicity that no degeneracies occur). We set $L(y) = \{y' \in \mathcal{T}_{z,x} : d(y') > d(y)\}$.
Then the family of disjoint events $\{\tau^x_x > \tau^x_{y_j}\} \cap \{\tau^x_x < \tau^x_{L(y_j)}\}$ can serve as a partition of
unity, i.e. we have that



j=0 
k 



mm 



mm 



j=0 

Now again 
while 



j=0 



^x ^ ^ViVx ^ ^Tf 



g-7V[F;v(^*(yi,x))-Fjv(x)] 



(6.14) 



(6.15) 
(6.16) 
which is much smaller than e"''^^^^') (except in the case $j = 0$, where we are back in the
situation of Proposition 6.1). Combining all these estimates and using again (6.7) yields the
claim of the proposition. $\diamondsuit$

Remark: Note that Proposition 6.2 again exhibits the special role played by admissible 
transitions. It justifies the idea that the behaviour of the process during an admissible 
transition can be described, on the time scale of the expected transition time^^, as waiting 
in the immediate vicinity of the starting minimum for an exponential time until jumping 
quasi-instantaneously to the destination point. This idea can also be expressed by passing to 
a measure valued description (as was done in [GOSV]) which will exhibit that the empirical 
measure of the process on any time scale small compared to the expected transition time but 
long compared to the next-smallest transition time within the admissible transition, is close 
to the Dirac mass at the minimum; since this, in turn, is asymptotically the invariant measure 
of the process conditioned to stay in the valley associated to the admissible transition, it can 
thus justly be seen as a metastable state associated with this time scale. The corresponding 
measure-valued process is then close to a jump process on the Dirac measures centered at
these points. These results can be derived easily from the preceding Propositions, and we 
will not go into the details. 

Let us also mention that from the preceding results and Corollary 5.8 (ii) one can easily 
extract statements concerning "exponential convergence to equilibrium". E.g., one has the 
following. 

Corollary 6.3: Let $N_k \uparrow \infty$ be a subsequence such that for all $k$ the topological structure
of the tree from Section 5 is the same and such that along the subsequence, $F_{N_k}$ is generic.
Let $m_0$ denote the lowest minimum of $F_{N_k}$. Let $f \in C(\Lambda, \mathbb{R})$ be any continuous function
on the state space. Consider the process starting in some point $x \in \Gamma_{N_k}$. Then there is
a unique minimum $m(x)$ of $F_{N_k}$, converging to a minimum $m(x)$ of $F$, such that, setting



m 
'mo 



$$ \lim_{k\uparrow\infty} E f\left(X_{tT(k)}\right) = e^{-t}f(m(x)) + (1 - e^{-t})f(m_0) \tag{6.17} $$



(where on the right hand side $m(x), m_0$ denote the corresponding minima of the limiting
function $F$). The point $m(x)$ is the lowest minimum of the deepest valley visited by the
process in the canonical decomposition of the transition $x, m_0$ given in Theorem 4.4.



^^A finer resolution will of course exhibit rare and rapid excursions to other minima during the time of 
the admissible transition, and we have all the tools to investigate these interior cycles. 
We leave the proof of the corollary to the reader. In a way such statements that involve
convergence on a single time-scale are rather poor reflections of the complex structure of the 
behaviour of the process that is encoded in the description given in Section 4. 

Relation to spectral theory. 

Contrary to much of the work on the dynamics of spin systems we have not used the notion 
of "spectral gap" in this paper, and in fact the analysis of spectra has been limited in general 
to the rather auxiliary estimates in Section 2. Of course these approaches are closely related 
and our results could be re-interpreted in terms of spectral theory. 

Most evidently, the estimate given in Theorem 5.1 can also be seen as a precise estimate
on the largest eigenvalue of the Dirichlet operator associated with the admissible transition
$\mathcal{F}(x,z,y)$. Moreover, these Dirichlet eigenvalues are closely related to the low-lying spectrum
of the stochastic matrix $P_N$. Sharp estimates on this relation require however some work,
and we will not pursue this analysis in this paper but relegate it to forthcoming work in
which the relation between the metastability problem and the associated quantum mechanical
tunneling problem will be further elucidated.

7. Mean field models and mean field dynamics 

Our main motivation is to study the properties of stochastic dynamics for a class of models 
called "generalized random mean field models" that were introduced in [BGl]. We recall that 
such models require the following ingredients: 

(i) A single spin space $\mathcal{S}$ that we will always take to be a subset of some linear space, equipped
with some a priori probability measure $q$.

(ii) A state space $\mathcal{S}^N$ whose elements we denote by $\sigma$ and call spin configurations, equipped
with the product measure $\prod_i q(d\sigma_i)$.

(iii) The dual space $(\mathcal{S}^N)^{*M}$ of linear maps $\xi^T_{N,M} : \mathcal{S}^N \to \mathbb{R}^M$.

(iv) A mean field potential which is some real valued function $E_M : \mathbb{R}^M \to \mathbb{R}$.

(v) An abstract probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and measurable maps $\xi^T : \Omega \to (\mathcal{S}^{\mathbb{N}})^{*\mathbb{N}}$. Note that
if $\Pi_N$ is the canonical projection $\mathbb{R}^{\mathbb{N}} \to \mathbb{R}^N$, then $\xi^T_{M,N}[\omega] = \Pi_M\xi^T[\omega]\circ\Pi_N^{-1}$ are random
elements of $(\mathcal{S}^N)^{*M}$.

(vi) The random order parameter

$$ m_{N,M}[\omega](\sigma) = \frac{1}{N}\,\xi^T_{M,N}[\omega]\,\sigma \in \mathbb{R}^M \tag{7.1} $$

(vii) A random Hamiltonian

$$ H_{N,M}[\omega](\sigma) = -NE_M\left(m_{N,M}[\omega](\sigma)\right) \tag{7.2} $$

In [BGl] the equilibrium properties of such models were studied in the case where $M =
M(N)$ grows with $N$. Our aim in the long run is to be able to study dynamics in this
situation, but in the present paper we restrict ourselves to the case of fixed $M = d$. Also, we will
only consider the case where $\mathcal{S}$ is a finite set.

Typical dynamics studied for such models are Glauber dynamics, i.e. (random) Markov
chains $\sigma(t)$, defined on the configuration space $\mathcal{S}^N$, that are reversible with respect to the
(random) Gibbs measures

$$ \mu_{\beta,N,M}[\omega](d\sigma) = \frac{e^{-\beta H_{N,M}[\omega](\sigma)}\prod_{i=1}^N q(d\sigma_i)}{Z_{\beta,N,M}[\omega]} \tag{7.3} $$

and in which the transition rates are non-zero only if the final configuration can be obtained
from the initial one by changing the value of one spin only. To simplify notation we will
henceforth drop the reference to the random parameter $\omega$.

As always the final goal will be to understand the macroscopic dynamics, i.e. the behaviour
of $m_N(\sigma(t))$ as a function of time. It would be very convenient in this situation if $m_N(\sigma(t))$
were itself a Markov chain with state space $\mathbb{R}^d$. Such a Markov chain would be reversible
with respect to the measure induced by the Gibbs measure on $\mathbb{R}^d$ through the map $m_N$, and
this measure has nice large deviation properties. Unfortunately, $m_N(\sigma(t))$ is almost never
a Markov chain. A notable exception is the (non-random) Curie-Weiss model (see the next
section). There are special situations in which it is possible to introduce a larger number of
macroscopic order parameters in such a way that the corresponding induced process will be
Markovian; in general this will not be possible. However, there is a canonical construction
of a new Markov process on $\mathbb{R}^d$ that can be expected to be a good approximation to the
induced process. This construction and the following results are all adapted from Liggett [Li],
Section II.6.

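Let $r_N(\sigma,\sigma')$ be transition rates of a Glauber dynamics reversible with respect to the
measure $\mu_{\beta,N}$, i.e. for $\sigma \neq \sigma'$, $r_N(\sigma,\sigma') = \sqrt{\mu_{\beta,N}(\sigma')/\mu_{\beta,N}(\sigma)}\;g_N(\sigma,\sigma')$ where $g_N(\sigma,\sigma') = g_N(\sigma',\sigma)$.
We denote by $\mathcal{R}_N$ the law of this Markov chain and by $\sigma(t)$ the coordinate variables. Define
the induced measure

$$ Q_{\beta,N} = \mu_{\beta,N}\circ m_N^{-1} \tag{7.4} $$

and the new transition rates for a Markov chain with state space $\Gamma_N = m_{N,d}(\mathcal{S}^N)$ (we
drop the indices of $m_{N,d}$ in the sequel) by

$$ p_N(x,y) = \frac{1}{Q_{\beta,N}(x)}\sum_{\sigma:\,m(\sigma)=x}\ \sum_{\sigma':\,m(\sigma')=y}\mu_{\beta,N}(\sigma)\,r_N(\sigma,\sigma') \tag{7.5} $$

Theorem 7.1: Let $\mathbb{P}_N$ be the law of the Markov chain $x(t)$ with state space $\Gamma_N$ and transition
rates $p_N(x,y)$ given by (7.5). Then $Q_{\beta,N}$ is the unique reversible invariant measure
for the chain $x(t)$. Moreover, for any $\sigma \in \mathcal{S}^N$ and $D \subset \mathcal{S}^N$, one has

$$ \mu_{\beta,N}(\sigma)\,\mathcal{R}_N\left[\tau^\sigma_D < \tau^\sigma_\sigma\right] \le Q_{\beta,N}(m(\sigma))\,\mathbb{P}_N\left[\tau^{m(\sigma)}_{m(D)} < \tau^{m(\sigma)}_{m(\sigma)}\right] \tag{7.6} $$

Finally, the image process $m(\sigma(t))$ is Markovian and has law $\mathbb{P}_N$ if for all $\sigma, \sigma''$ such that
$m(\sigma) = m(\sigma'')$, $r(\sigma,\cdot) = r(\sigma'',\cdot)$. If the initial measure $\pi_0$ is such that for all $\sigma$, $\pi_0(\sigma) > 0$,
then this condition is also necessary.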

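Remark: Notice that by the ergodic theorem, we can rewrite (7.6) in the less disturbing
form

$$ \frac{\mathcal{R}_N\left[\tau^\sigma_D < \tau^\sigma_\sigma\right]}{E\,\tau^\sigma_\sigma} \le \frac{\mathbb{P}_N\left[\tau^{m(\sigma)}_{m(D)} < \tau^{m(\sigma)}_{m(\sigma)}\right]}{E\,\tau^{m(\sigma)}_{m(\sigma)}} \tag{7.7} $$

from which we see that the theorem really implies an inequality for the arrival times in $D$.

Proof: Note that we can write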



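$$ p_N(x,y) = \sqrt{\frac{Q_{\beta,N}(y)}{Q_{\beta,N}(x)}}\ \frac{1}{\sqrt{Q_{\beta,N}(x)Q_{\beta,N}(y)}}\sum_{\sigma:\,m(\sigma)=x}\ \sum_{\sigma':\,m(\sigma')=y}\sqrt{\mu_{\beta,N}(\sigma)\mu_{\beta,N}(\sigma')}\;g_N(\sigma,\sigma') \tag{7.8} $$

which makes the reversibility of the new chain obvious. Note also that if $r_N(\sigma,\sigma')$ is constant
on the sets $m^{-1}(x)$, then

$$ p_N(x,y) = \sum_{\sigma':\,m(\sigma')=y} r_N(\sigma,\sigma') = \mathcal{R}_N\left[m(\sigma(t+1)) = y \mid \sigma(t) = \sigma\right] \tag{7.9} $$

which is only a function of $m(\sigma)$. From this one sees easily that in this case, the law of
$m(\sigma(t))$ is $\mathbb{P}_N$. The proof of the converse statement is a little more involved and can be found
in [BR].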
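Finally, the inequality (7.6) is proven in [Li] (Theorem 6.10). However, the proof given
there is rather cumbersome, and meant to illustrate coupling techniques, while the result
follows in a much simpler way from Theorem 6.1 in the same book. It may be worthwhile to
outline the argument. Theorem 6.1 in [Li] states that

$$ \mu_{\beta,N}(\sigma)\,\mathcal{R}_N\left[\tau^\sigma_D < \tau^\sigma_\sigma\right] = \frac{1}{2}\inf_{h\in\mathcal{H}^\sigma_D}\Phi_N(h) \tag{7.10} $$

where $\mathcal{H}^\sigma_D$ is the set of functions

$$ \mathcal{H}^\sigma_D = \left\{h : \mathcal{S}^N \to [0,1] : h(\sigma) = 0,\ \forall_{\sigma'\in D}\ h(\sigma') = 1\right\} \tag{7.11} $$

and $\Phi_N$ is the Dirichlet form associated to the chain $\mathcal{R}_N$,

$$ \Phi_N(h) = \sum_{\sigma,\sigma'\in\mathcal{S}^N}\mu_{\beta,N}(\sigma)\,r_N(\sigma,\sigma')\left[h(\sigma) - h(\sigma')\right]^2 \tag{7.12} $$

Now we clearly majorize the infimum by restricting it to functions that are constant on the
level sets of the map $m$, that is if we define the set

$$ \widetilde{\mathcal{H}}^{m(\sigma)}_{m(D)} = \left\{\tilde h : \Gamma_N \to [0,1] : \tilde h(m(\sigma)) = 0,\ \forall_{y\in m(D)}\ \tilde h(y) = 1\right\} \tag{7.13} $$

we have that

$$ \inf_{h\in\mathcal{H}^\sigma_D}\Phi_N(h) \le \inf_{\tilde h\in\widetilde{\mathcal{H}}^{m(\sigma)}_{m(D)}}\Phi_N(\tilde h\circ m) \tag{7.14} $$

But

$$ \Phi_N(\tilde h\circ m) = \sum_{x,x'\in\Gamma_N}\left[\tilde h(x) - \tilde h(x')\right]^2\sum_{\sigma:\,m(\sigma)=x,\ \sigma':\,m(\sigma')=x'}\mu_{\beta,N}(\sigma)\,r_N(\sigma,\sigma') = \sum_{x,x'\in\Gamma_N}\left[\tilde h(x) - \tilde h(x')\right]^2 Q_{\beta,N}(x)\,p_N(x,x') = \widetilde\Phi_N(\tilde h) \tag{7.15} $$

where $\widetilde\Phi_N$ is the Dirichlet form of the chain $\mathbb{P}_N$. Using the analog of (7.10) for this new chain
we arrive at the inequality (7.6). $\diamondsuit$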
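The construction (7.4)-(7.5) is easy to test on a small system. The following Python sketch enumerates all configurations of a few $\pm 1$ spins, builds the induced measure and the induced rates for the total spin, and lets one check detailed balance numerically. The Metropolis form of the single-spin-flip rates, the random-field Curie-Weiss energy used as an example, and the names `lumped_chain`, `thetas` are illustrative assumptions rather than choices made in the text.

```python
import itertools
import math
from collections import defaultdict


def lumped_chain(thetas, beta):
    """Return (Q, p) for the chain induced on the total spin sum(sigma).

    Q[x] is the induced measure (7.4); p[(x, y)] are the rates (7.5),
    built from single-spin-flip Metropolis rates r(sigma, sigma').
    """
    n = len(thetas)

    def energy(sigma):
        m = sum(sigma) / n
        return -0.5 * n * m * m - sum(t * s for t, s in zip(thetas, sigma))

    configs = list(itertools.product((-1, 1), repeat=n))
    weights = {sigma: math.exp(-beta * energy(sigma)) for sigma in configs}
    z = sum(weights.values())
    mu = {sigma: w / z for sigma, w in weights.items()}

    q = defaultdict(float)     # Q(x): mass of the level set {m = x}
    flow = defaultdict(float)  # sum of mu(sigma) r(sigma, sigma') between level sets
    for sigma in configs:
        x = sum(sigma)  # integer label N * m(sigma), avoids float keys
        q[x] += mu[sigma]
        for i in range(n):
            flipped = sigma[:i] + (-sigma[i],) + sigma[i + 1:]
            rate = min(1.0, mu[flipped] / mu[sigma]) / n
            flow[(x, sum(flipped))] += mu[sigma] * rate
    p = {(x, y): f / q[x] for (x, y), f in flow.items()}
    return dict(q), p
```

For any field realization, `Q[x] * p[(x, y)]` and `Q[y] * p[(y, x)]` agree up to rounding, which is the reversibility asserted in Theorem 7.1; this holds whether or not the image process itself is Markovian.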

We certainly expect that in many situations the Markov chain $x(t)$ under the law $\mathbb{P}_N$ has
essentially the same long-time behaviour as the non-Markovian image process $m(\sigma(t))$.
However, we have no general results and there are clearly situations imaginable in which this
would not be true. In the next section we will apply our general results to a specific model
where this issue in particular can be studied nicely.
The simplest example of disordered mean field models is the random field Curie-Weiss
model. Here $\mathcal{S} = \{-1, 1\}$ and $q$ is the uniform distribution on this set. Its Hamiltonian is

$$ H_N[\omega](\sigma) = -N\,\frac{[m_N(\sigma)]^2}{2} - \sum_{i=1}^N\theta_i[\omega]\sigma_i \tag{8.1} $$

where

$$ m_N(\sigma) = \frac{1}{N}\sum_{i=1}^N\sigma_i \tag{8.2} $$

is called the magnetization. Here $\theta_i$, $i \in \mathbb{N}$, are i.i.d. random variables. The dynamics of this
model has been studied before: Dai Pra and den Hollander studied the short-time dynamics
using large deviation results and obtained the analog of the McKean-Vlasov equations
[dPdH]. Mathieu and Picco [MPl] considered convergence to equilibrium in a particularly
simple case where the random field takes only the two values $\pm\epsilon$ (with further restrictions on
the parameters that exclude the presence of more than two minima).

In this section we take up this simple model in the more general situation where the
random field is allowed to take values in an arbitrary finite set. The main idea here is that in
this case we are, as we will see, in the position to construct an image of the Glauber dynamics
in a finite dimensional space that is Markovian, while it will be possible to compare this to
the Markovian dynamics defined on the single parameter $m_N$ in the manner described in the
previous section.

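We consider the Hamiltonian (8.1) where the $\theta_i$ take values in the set

$$ \mathcal{H} = \{h_1, \dots, h_{K-1}, h_K\} \tag{8.3} $$

Each realization of the random field $\{\theta_i[\omega]\}_{i\in\mathbb{N}}$ induces a random partition of the set $\Lambda =
\{1, \dots, N\}$ into subsets

$$ \Lambda_k[\omega] = \{i \in \Lambda : \theta_i[\omega] = h_k\} \tag{8.4} $$

We may introduce $K$ order parameters

$$ m_k[\omega](\sigma) = \frac{1}{N}\sum_{i\in\Lambda_k[\omega]}\sigma_i \tag{8.5} $$

We denote by $\boldsymbol{m}[\omega]$ the $K$-dimensional vector $(m_1[\omega], \dots, m_K[\omega])$. Note that these take values
in the set

$$ \Gamma_N[\omega] = \times_{k=1}^K\left\{-\rho_{N,k}[\omega],\ -\rho_{N,k}[\omega] + \tfrac{2}{N},\ \dots,\ \rho_{N,k}[\omega] - \tfrac{2}{N},\ \rho_{N,k}[\omega]\right\} \tag{8.6} $$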
where

$$ \rho_{N,k}[\omega] = \frac{|\Lambda_k[\omega]|}{N} \tag{8.7} $$

Note that the random variables $\rho_{N,k}$ concentrate exponentially (in $N$) around their mean
values $E_h\rho_{N,k} = P[\theta_i = h_k] \equiv p_k$. Obviously $m^1[\omega](\sigma) = \sum_{k=1}^K m_k[\omega](\sigma)$ and $m^2[\omega](\sigma) =
\sum_{k=1}^K h_k m_k[\omega](\sigma)$, so that the Hamiltonian can be written as a function of the variables $\boldsymbol{m}[\omega](\sigma)$,
via

$$ H_N[\omega](\sigma) = -NE(\boldsymbol{m}[\omega](\sigma)) \tag{8.8} $$

where $E : \mathbb{R}^K \to \mathbb{R}$ is the deterministic function

$$ E(x) = \frac{1}{2}\Big(\sum_{k=1}^K x_k\Big)^2 + \sum_{k=1}^K h_k x_k \tag{8.9} $$

The point is now that the image of the Glauber dynamics under the family of functions $m_k$
is again Markovian. This follows easily by verifying the criterion given in Theorem 7.1.
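The criterion of Theorem 7.1 can be checked by brute force on a small system. The Python sketch below computes, for every configuration, the total rate of jumping into each level set of an order parameter and reports the largest discrepancy between configurations sharing the same value of that order parameter. Metropolis single-spin-flip rates and all function names are assumptions made for illustration; the text does not fix a particular choice of rates.

```python
import itertools
import math
from collections import defaultdict


def lumpability_gap(thetas, beta, order_parameter):
    """Largest difference of outgoing lumped rates among configurations
    that share the same value of order_parameter (0 means lumpable)."""
    n = len(thetas)

    def energy(sigma):
        m = sum(sigma) / n
        return -0.5 * n * m * m - sum(t * s for t, s in zip(thetas, sigma))

    reference = {}
    worst = 0.0
    for sigma in itertools.product((-1, 1), repeat=n):
        outgoing = defaultdict(float)
        for i in range(n):
            flipped = sigma[:i] + (-sigma[i],) + sigma[i + 1:]
            delta = energy(flipped) - energy(sigma)
            outgoing[order_parameter(flipped)] += min(1.0, math.exp(-beta * delta)) / n
        first = reference.setdefault(order_parameter(sigma), outgoing)
        for y in set(first) | set(outgoing):
            worst = max(worst, abs(first.get(y, 0.0) - outgoing.get(y, 0.0)))
    return worst


def block_magnetizations(thetas):
    """Order parameter (8.5), up to the factor 1/N: one spin sum per field value."""
    values = sorted(set(thetas))

    def m(sigma):
        return tuple(sum(s for s, t in zip(sigma, thetas) if t == h) for h in values)
    return m


def total_magnetization(sigma):
    """Single order parameter (8.2), up to the factor 1/N."""
    return sum(sigma)
```

With two field values, the gap for `block_magnetizations` vanishes up to rounding, while the gap for `total_magnetization` is of order one: the $K$-dimensional image is Markovian, the one-dimensional one is not.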

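On the other hand, it is easy to compute the equilibrium distribution of the variables $\boldsymbol{m}[\omega]$.
Obviously,

$$ \mu_{\beta,N}[\omega]\left(\boldsymbol{m}[\omega](\sigma) = x\right) \equiv Q_{\beta,N}[\omega](x) = \frac{1}{Z_N[\omega]}\,e^{\beta NE(x)}\prod_{k=1}^K 2^{-N\rho_{N,k}}\binom{N\rho_{N,k}}{N(\rho_{N,k}+x_k)/2} \tag{8.10} $$

where $Z_N[\omega]$ is the normalizing partition function. Stirling's formula yields the well known
asymptotic expansion for the binomial coefficients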

where 

$$ I(x) = \frac{1+x}{2}\ln(1+x) + \frac{1-x}{2}\ln(1-x) \tag{8.12} $$

is the usual Cramér entropy and



with $C(\rho N)$ a constant independent of $x_k$ (and thus irrelevant) that satisfies $C(\rho N) =
O(\ln(\rho N))$. Thus

$$ F_N[\omega](x) \equiv -\frac{1}{\beta N}\ln Q_{\beta,N}[\omega](x) = F_{0,N}[\omega](x) + F_{1,N}[\omega](x) + C_N \tag{8.14} $$

with

$$ F_{0,N}(x) = -E(x) + \frac{1}{\beta}\sum_{k=1}^K\rho_{N,k}\,I(x_k/\rho_{N,k}) \tag{8.15} $$

$C_N = \beta^{-1}\sum_{k=1}^K\rho_{N,k}C(\rho_{N,k}N)$ is constant and of order $\ln N$, and $F_{1,N}$ of order $1/N$, uniformly
on compact subsets of $\Gamma = \times_{k=1}^K(-p_k, p_k)$. Moreover, $F_N(x)$ converges almost surely
to the deterministic function

$$ F_0(x) = -E(x) + \frac{1}{\beta}\sum_{k=1}^K p_k\,I(x_k/p_k) \tag{8.16} $$

uniformly on compact subsets of $\Gamma$. The dominant contribution to the finite volume corrections
thus comes from the fluctuation part of the function $F_{0,N}$, $F_{0,N}(x) - F_0(x)$. One
easily verifies that all conditions imposed on the functions $F_N$ in Section 1 are verified in this
example.
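The role of the Cramér entropy (8.12) as the exponential rate of the binomial weights can be checked numerically. The following short Python sketch compares $-\frac{1}{n}\ln\left(2^{-n}\binom{n}{k}\right)$ with $I(x)$ at $x = 2k/n - 1$; the function names are illustrative, and the comparison is only expected to hold up to the logarithmic corrections of order $\ln(n)/n$ mentioned above.

```python
import math


def cramer_entropy(x):
    """I(x) = (1+x)/2 ln(1+x) + (1-x)/2 ln(1-x) for -1 < x < 1."""
    return 0.5 * (1 + x) * math.log(1 + x) + 0.5 * (1 - x) * math.log(1 - x)


def binomial_rate(n, k):
    """-(1/n) ln( 2^{-n} C(n, k) ), computed stably through lgamma."""
    log_binom = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return -(log_binom - n * math.log(2)) / n
```

For instance, with `n = 10000` and `k = 7000` the two quantities differ by less than `1e-3`, the difference being the Stirling correction.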

The landscape given by F. The deterministic picture. 

To see what the landscape of the function $F_N$ looks like, we begin by studying the deterministic
limiting function $F_0$. Let us first look at the critical points. They are solutions of
the equation $\nabla F_0(x) = 0$, which reads explicitly

$$ 0 = \frac{\partial}{\partial x_k}F_0(x) = -\sum_{\ell=1}^K x_\ell - h_k + \frac{1}{\beta}\,I'(x_k/p_k), \qquad k = 1, \dots, K \tag{8.17} $$

or equivalently, since $I'(x) = \frac{1}{2}\ln\frac{1+x}{1-x} = \operatorname{artanh}(x)$,

$$ x_k = p_k\tanh(\beta(m + h_k)), \quad k = 1, \dots, K, \qquad m = \sum_{k=1}^K x_k \tag{8.18} $$

These equations have a particularly pleasant structure. Their solutions are generated by
solutions of the transcendental equation

$$ m = \sum_{k=1}^K p_k\tanh(\beta(m + h_k)) \equiv E_h\tanh\beta(m + h) \tag{8.19} $$

Thus if $m^{(1)}, \dots, m^{(r)}$ are the solutions of (8.19), then the full set of solutions of the equations
(8.17) is given by the vectors $x^{(1)}, \dots, x^{(r)}$ defined by

$$ x^{(i)}_k = p_k\tanh\beta(m^{(i)} + h_k) \tag{8.20} $$
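Since the critical points are generated by the one-dimensional equation (8.19), they are easy to compute. The following Python sketch scans $[-1,1]$ for sign changes of $E_h\tanh\beta(m+h) - m$, refines each by bisection, and then builds the vectors (8.20). The grid size, the tolerance and the function names are assumptions made for illustration; roots of even multiplicity (tangencies) would be missed by a sign-change scan.

```python
import math


def mean_field_solutions(beta, fields, probs, grid=2000, tol=1e-12):
    """Solutions m of m = sum_k p_k tanh(beta (m + h_k)) in [-1, 1]."""
    def g(m):
        return sum(p * math.tanh(beta * (m + h)) for h, p in zip(fields, probs)) - m

    roots = []
    points = [-1.0 + 2.0 * i / grid for i in range(grid + 1)]
    for a, b in zip(points, points[1:]):
        ga, gb = g(a), g(b)
        if ga == 0.0:            # a grid point that is exactly a root
            roots.append(a)
        elif ga * gb < 0.0:      # sign change: refine by bisection
            while b - a > tol:
                c = 0.5 * (a + b)
                if g(a) * g(c) <= 0.0:
                    b = c
                else:
                    a = c
            roots.append(0.5 * (a + b))
    return roots


def critical_points(beta, fields, probs):
    """Vectors x with x_k = p_k tanh(beta (m + h_k)), one per solution m."""
    return [tuple(p * math.tanh(beta * (m + h)) for h, p in zip(fields, probs))
            for m in mean_field_solutions(beta, fields, probs)]
```

For the symmetric two-valued field $h = \pm 0.2$ with $p_1 = p_2 = 1/2$, this gives a single solution at $\beta = 0.5$ and three solutions at $\beta = 2$.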
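Next we analyze the structure of the critical points. Using that $I''(x) = \frac{1}{1-x^2}$ we see that

$$ \frac{\partial^2}{\partial x_k\,\partial x_{k'}}F_0(x) = -1 + \delta_{k,k'}\,\frac{1}{\beta p_k\left(1 - x_k^2/p_k^2\right)} \tag{8.21} $$

Thus at a critical point $x^{(i)}$,

$$ \frac{\partial^2}{\partial x_k\,\partial x_{k'}}F_0\left(x^{(i)}\right) = -1 + \delta_{k,k'}\,\lambda_k(m^{(i)}) \tag{8.22} $$

where

$$ \lambda_k(m) = \frac{1}{\beta p_k\left(1 - \tanh^2(\beta(m + h_k))\right)} \tag{8.23} $$

Lemma 8.1: The Hessian of $F_0$ at $x^{(i)}$ has at most one negative eigenvalue. A negative
eigenvalue exists if and only if

$$ \beta E_h\left(1 - \tanh^2(\beta(m^{(i)} + h))\right) > 1 \tag{8.24} $$

Proof: Consider any matrix of the form $A_{k,k'} = -1 + \delta_{k,k'}\lambda_k$ with $\lambda_k > 0$; we show that it has at most one negative eigenvalue. To see this, let
$\{\zeta_1, \dots, \zeta_L\}$ denote the set of distinct values that are taken by $\lambda_1, \dots, \lambda_K$. Put $\kappa_\ell = \{k :
\lambda_k = \zeta_\ell\}$ and denote by $|\kappa_\ell|$ the cardinalities of these sets. Now the eigenvalue equations read

$$ -\sum_{j=1}^K u_j + (\lambda_k - \gamma)u_k = 0, \qquad k = 1, \dots, K \tag{8.25} $$

Let $\zeta_\ell$ be such that $|\kappa_\ell| > 1$, if such a $\zeta_\ell$ exists. Then we will construct $|\kappa_\ell| - 1$ orthogonal
solutions to (8.25) with eigenvalue $\gamma = \zeta_\ell$. Namely, we set $u_k = 0$ for all $k \notin \kappa_\ell$. The
remaining components must satisfy $\sum_{k\in\kappa_\ell}u_k = 0$. But obviously, this equation has $|\kappa_\ell| - 1$
orthonormal solutions. Doing this for every $\zeta_\ell$, we construct altogether $K - L$ eigenvectors
corresponding to the eigenvalues $\zeta_\ell$. Note that for all these solutions, $\sum_{k=1}^K u_k = 0$. We are left
with finding the remaining $L$ eigenfunctions. Now take $\gamma \notin \{\zeta_1, \dots, \zeta_L\}$. Then (8.25) can be
rewritten as

$$ u_k = \frac{\sum_{j=1}^K u_j}{\lambda_k - \gamma} \tag{8.26} $$

Summing equation (8.26) over $k$, we get

$$ \sum_{k=1}^K u_k = \sum_{j=1}^K u_j\,\sum_{k=1}^K\frac{1}{\lambda_k - \gamma} \tag{8.27} $$

Since we have already exhausted the solutions with $\sum_{k=1}^K u_k = 0$, we get for the remaining
ones the condition

$$ 1 = \sum_{k=1}^K\frac{1}{\lambda_k - \gamma} = \sum_{\ell=1}^L\frac{|\kappa_\ell|}{\zeta_\ell - \gamma} \tag{8.28} $$

Inspecting the right-hand side of (8.28) one sees immediately that this equation has precisely
$L$ solutions $\gamma_i$ that satisfy

$$ \gamma_1 < \zeta_1 < \gamma_2 < \zeta_2 < \gamma_3 < \dots < \gamma_L < \zeta_L \tag{8.29} $$

of which at most $\gamma_1$ can be negative. Moreover, a negative solution $\gamma$ implies that

$$ 1 = \sum_{\ell=1}^L\frac{|\kappa_\ell|}{\zeta_\ell - \gamma} < \sum_{\ell=1}^L\frac{|\kappa_\ell|}{\zeta_\ell} = \sum_{k=1}^K\frac{1}{\lambda_k} \tag{8.30} $$

which upon inserting the specific form of $\lambda_k$ yields (8.24). On the other hand, if $\sum_{k=1}^K\frac{1}{\lambda_k} > 1$,
then by monotonicity there exists a negative solution to (8.28). This proves the lemma. $\diamondsuit$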
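The statement of Lemma 8.1 can be checked numerically for matrices of the form $-1 + \delta_{k,k'}\lambda_k$. The Python sketch below counts negative eigenvalues through the signs of the pivots of symmetric Gaussian elimination (Sylvester's law of inertia, valid when no pivot vanishes) and computes the $\lambda_k$ of (8.23). This is a swapped-in numerical technique, not the argument of the proof, and the function names are illustrative assumptions.

```python
import math


def count_negative_eigenvalues(lambdas):
    """Number of negative eigenvalues of A = diag(lambdas) - (all-ones matrix),
    read off from the pivot signs of Gaussian elimination without pivoting."""
    k = len(lambdas)
    a = [[(lambdas[i] if i == j else 0.0) - 1.0 for j in range(k)] for i in range(k)]
    negatives = 0
    for i in range(k):
        pivot = a[i][i]
        if abs(pivot) < 1e-12:
            raise ValueError("zero pivot: inertia cannot be read off this way")
        if pivot < 0:
            negatives += 1
        for r in range(i + 1, k):
            factor = a[r][i] / pivot
            for c in range(i, k):
                a[r][c] -= factor * a[i][c]
    return negatives


def hessian_lambdas(m, beta, fields, probs):
    """lambda_k(m) = 1 / (beta p_k (1 - tanh^2(beta (m + h_k))))."""
    return [1.0 / (beta * p * (1.0 - math.tanh(beta * (m + h)) ** 2))
            for h, p in zip(fields, probs)]
```

In each example tried, the count is 1 exactly when $\sum_k 1/\lambda_k > 1$ and 0 otherwise, in agreement with (8.24) and (8.30).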

The following general features are now easily verified due to the fact that the analysis of
the critical points is reduced to equations of one single variable. The following facts hold:

(i) For any distribution of the field, there exists $\beta_c$ such that: If $\beta < \beta_c$, there exists a single
critical point and $F_0$ is strictly convex. If $\beta > \beta_c$, there exist at least 3 critical points,
the first and the last of which (according to the value of $m$) are local minima, and each
minimum is followed by a saddle with one negative eigenvalue, and vice versa, with possibly
intermediate saddles with one zero eigenvalue interspersed.

(ii) Assume $\beta > \beta_c$. Then each pair of consecutive critical points of $F_0$ can be joined by a
unique integral curve of the vector field $\nabla F_0(x)$.

The exact picture of the landscape depends of course on the particular distribution of the
magnetic field chosen. In particular, the exact number of critical points, and especially
of minima, depends on the distribution (and on the temperature). The reader is invited to
use e.g. Mathematica and produce diverse pictures for her favorite choices. We see that a
major effect of the disorder enters into the form of the deterministic function $F_0(x)$. Only
a secondary role is played by the remnant disorder, whose effect will be most notable in
symmetric situations where it can break symmetries present on the level of $F_0$.

Fluctuations 
In the present simple situation it turns out that the fluctuations of the function $F_{0,N}$ can
also be controlled in a precise way. We will show the following result.

Proposition 8.3: Let $g_k$, $k = 1, \dots, K$, be a family of independent Gaussian random variables
with mean zero and variance $p_k(1-p_k)$. Then the function $\sqrt{N}[F_N(x) - F_0(x)]$ converges
in distribution, uniformly on compact subsets of $\Gamma$, to the random function

$$ \frac{1}{\beta}\sum_{k=1}^K g_k\left(I(x_k/p_k) - \frac{x_k}{p_k}\,I'(x_k/p_k)\right) \tag{8.31} $$

Proof: Since $F_N - F_{0,N}$ converges to zero uniformly, it is enough to consider

$$ F_{0,N}(x) - F_0(x) = \frac{1}{\beta}\sum_k\left(\rho_{N,k}I(x_k/\rho_{N,k}) - p_kI(x_k/p_k)\right) = \frac{1}{\beta}\sum_k\Big((\rho_{N,k} - p_k)I(x_k/p_k) + p_k\left(I(x_k/\rho_{N,k}) - I(x_k/p_k)\right) + (\rho_{N,k} - p_k)\left(I(x_k/\rho_{N,k}) - I(x_k/p_k)\right)\Big) \tag{8.32} $$

Now in the interior of $\Gamma$ we may develop

$$ I(x_k/\rho_{N,k}) - I(x_k/p_k) = -(\rho_{N,k} - p_k)\,\frac{x_k}{p_k^2}\,I'(x_k/p_k) + O\left((\rho_{N,k} - p_k)^2\right) \tag{8.33} $$

Now the $\rho_{N,k}$ are actually sums of independent Bernoulli random variables with mean $p_k$,
namely $\rho_{N,k} = \frac{1}{N}\sum_{i=1}^N\delta_{h_k,\theta_i}$. Thus, by the exponential Chebyshev inequality,

$$ P\left[|\rho_{N,k} - p_k| > \epsilon\right] \le 2\exp\left(-NI_{p_k}(\epsilon)\right) \tag{8.34} $$

where $I_p(\epsilon) \ge 0$ is a strictly convex function that takes its minimum value at $\epsilon = 0$. Thus
with probability tending to one rapidly, we have that e.g. $(\rho_{N,k} - p_k)^2 \le N^{-3/4}$, which allows
us to neglect all second order remainders. Finally, by the central limit theorem the family
of random variables $\sqrt{N}(\rho_{N,k} - p_k)$ converges to a family of independent Gaussian random
variables with variances $p_k(1 - p_k)$. This yields the proposition. $\diamondsuit$

Relation to a one-dimensional problem. 

We note that the structure of the landscape in this case is quasi one-dimensional. This
is no coincidence. In fact, it is governed by the rate function of the total magnetization,
$-\frac{1}{\beta N}\ln\mu_{\beta,N}(m_N(\sigma) = m)$, which to leading order is computed, using standard techniques, as

$$ G_{0,N}(m) = -\frac{m^2}{2} + \sup_{t\in\mathbb{R}}\Big(mt - \frac{1}{\beta}\sum_{k=1}^K\rho_{N,k}\ln\cosh\beta(h_k + t)\Big) \tag{8.35} $$
The most important facts for us are collected in the following Lemma.

Lemma 8.2: The functions $G_{0,N}$ and $F_{0,N}$ are related in the following ways.

(i) For any $m \in [-1, 1]$,

$$ G_{0,N}(m) = \inf_{x\in\mathbb{R}^K:\ \sum_k x_k = m}F_{0,N}(x) \tag{8.36} $$

(ii) If $x^*$ is a critical point of $F_{0,N}$, then $m^* = \sum_k x^*_k$ is a critical point of $G_{0,N}$.

(iii) If $m^*$ is a critical point of $G_{0,N}$, then $x^*(m^*)$, with components $x^*_k(m^*) = \rho_{N,k}\tanh\beta(m^* +
h_k)$, is a critical point of $F_{0,N}$.

(iv) At any critical point $m^*$, $G_{0,N}(m^*) = F_{0,N}(x^*(m^*))$.

The proof of this lemma is based on elementary analysis and will be left to the reader.
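Parts (i) and (iv) of the lemma lend themselves to a quick numerical check. The Python sketch below evaluates the $K$-dimensional free energy of the form (8.15) and the one-dimensional rate function (8.35) for given weights, and, for $K = 2$, minimizes the former along the line $x_1 + x_2 = m$. The bisection bracket, the ternary search, the restriction to $K = 2$ in `constrained_minimum` and all function names are assumptions made for illustration.

```python
import math


def cramer_entropy(x):
    """I(x) from (8.12)."""
    return 0.5 * (1 + x) * math.log(1 + x) + 0.5 * (1 - x) * math.log(1 - x)


def free_energy(x, beta, fields, weights):
    """F(x) = -E(x) + (1/beta) sum_k w_k I(x_k / w_k), cf. (8.9) and (8.15)."""
    total = sum(x)
    energy = 0.5 * total ** 2 + sum(h * xk for h, xk in zip(fields, x))
    entropy = sum(w * cramer_entropy(xk / w) for w, xk in zip(weights, x))
    return -energy + entropy / beta


def rate_function(m, beta, fields, weights):
    """G(m) from (8.35); the supremum over t is located by bisection."""
    def slope(t):  # derivative in t of the concave function being maximised
        return m - sum(w * math.tanh(beta * (h + t)) for h, w in zip(fields, weights))

    lo, hi = -50.0, 50.0  # assumed wide enough to bracket the maximiser
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    log_cosh = sum(w * math.log(math.cosh(beta * (h + t))) for h, w in zip(fields, weights))
    return -0.5 * m * m + m * t - log_cosh / beta


def constrained_minimum(m, beta, fields, weights):
    """inf of F over x_1 + x_2 = m for K = 2, by ternary search (F is convex there)."""
    w1, w2 = weights
    lo, hi = max(-w1, m - w2), min(w1, m + w2)

    def f(x1):
        return free_energy((x1, m - x1), beta, fields, weights)

    for _ in range(200):
        a, b = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if f(a) < f(b):
            hi = b
        else:
            lo = a
    return f(0.5 * (lo + hi))
```

For $\beta = 2$, fields $\pm 0.2$ and weights $(1/2, 1/2)$, `constrained_minimum(0.3, ...)` and `rate_function(0.3, ...)` agree to high accuracy, as do the two functions at a critical point obtained by iterating (8.19).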

The point we want to make here is that while the dynamics induced by the Glauber
dynamics on the total magnetization is not Markovian, if we define a Markov chain $m(t)$ that
is reversible with respect to the distribution of the magnetization in the spirit of Section 7
and compare its behaviour to that of the Markov chain $\boldsymbol{m}(t) = \boldsymbol{m}(\sigma(t))$, the preceding result
assures that their long-time dynamics are identical, since all that matters are the precise
values of the respective free energies at their critical points, and these coincide according to the
preceding lemma (up to terms of order $1/N$ and the asymptotics given in (8.12), (8.13), up
to ($N$-dependent) constants). In other words, the two dynamics, when observed on the set
of minima of their respective free energies, are identical on the level of our precision.